As
part of the IT Architecture Initiative, the Office of Information Technologies
(OIT) is producing a series of papers outlining directions in information
technology architecture.
In the
spirit of RFCs, the papers are intended to facilitate understanding
of and open dialogue about information technology trends at Cornell,
with the ultimate goal of improving the utilization and interoperability
of information technology services throughout Cornell.
E-mail
at Cornell: To serve or not to serve
Prepared by R.
David Vernon
03/02
SYNOPSIS
The goal
of this paper is to illuminate current and alternative directions the
campus might pursue for maintaining central e-mail service(s). The paper
explores the
Current
status of mail services at Cornell
Impact of
mobile computing on mail use
Dawn of
non-Cornell-supplied mail services
Issues of
identity, content control, and privacy
Alternatives
Focus points
Closing
thoughts
Current
Status of Mail Services at Cornell
Cornell
manages electronic mail via a mix of departmentally controlled and central
mail services. Department systems are primarily based on Microsoft's
Exchange server whereas Cornell Information Technologies (CIT) uses
Carnegie Mellon's Cyrus mail server software (1). In
total, Cornell delivers approximately one million mail messages per
day with the principal volume being handled by CIT mail services. Although
these numbers may seem large, the volume is not out of line with the
mail volume seen at other major universities.
In aggregate,
the Exchange and Cyrus mail servers have the potential to support three
principal mail protocols. These protocols are
- Post
Office Protocol (POP)
- Internet
Message Access Protocol (IMAP)
- Messaging
Application Programming Interface (MAPI)--a proprietary protocol
for Microsoft Exchange servers
The above
protocols are required to structure the transfer of stored mail messages
between a server (Cyrus or Exchange) and the software client on the
desktop. These protocols should not be confused with Simple Mail Transfer
Protocol (SMTP) (2) used to exchange messages between
Internet post offices/mail servers, or with mail message structure standards
such as MIME (Multipurpose Internet Mail Extensions) (3).
For the purpose of this paper we are going to focus on mail server services--those
systems that allow the storage and exchange of stored mail with individual
mail clients, such as Microsoft's Outlook, Netscape, and Eudora. This
paper does not address the broader issue of messaging standards or how
messages are exchanged across the Internet.
There are
fundamental differences between the types of mail services each protocol
can support. It is important to understand these differences when exploring
alternative mail service directions.
Post Office
Protocol (POP) is simply a set of rules that allows a mail client to
communicate with a mail server so that a user can access his mail messages.
It is an open protocol standard outlined in a series of Requests for
Comments (RFCs) with POP3 being the standard revision in use today (4).
Any mail client supporting POP can receive mail from any server supporting
POP.
POP, however,
is limited in function when compared to newer mail protocols such as
Internet Message Access Protocol (IMAP). POP is limited to facilitating
the downloading of messages the POP mail server caches for a given client.
POP users can leave previously read messages on the server, but there
is no means for a mail client to use the server to store copies of sent
messages. This limitation is often seen as awkward for mail users who
use more than one client and want access to both inbound mail and
copies of sent mail. Simply stated, POP was designed assuming most users
would have a single mail client, and that this mail client would act
as the storage device for permanent retention of inbound and outbound
mail.
Like POP,
Internet Message Access Protocol (IMAP) is a set of rules for exchanging
mail messages between a mail client and a mail server. Unlike POP, IMAP
allows the sending of local data, such as copies of sent messages, to
the mail server for storage. This server-centric model gives a user access
to all of his or her mail regardless of the
number of client machines used (5).
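The difference between the two storage models can be illustrated with a toy in-memory model. This is a minimal sketch, not an implementation of either protocol: the names MailServer, PopStyleClient, and ImapStyleClient are hypothetical, no network traffic is involved, and the POP-style client is assumed to use the download-and-remove behavior that Cornell's policy encourages.

```python
class MailServer:
    """Toy server holding one user's folders (illustrative only)."""

    def __init__(self):
        self.folders = {"INBOX": [], "Sent": []}

    def deliver(self, message):
        self.folders["INBOX"].append(message)


class PopStyleClient:
    """Downloads inbound mail to local storage; sent copies stay on the device."""

    def __init__(self, server):
        self.server = server
        self.local_inbox = []
        self.local_sent = []

    def fetch(self):
        # Move everything cached on the server to this one device.
        self.local_inbox.extend(self.server.folders["INBOX"])
        self.server.folders["INBOX"].clear()

    def send(self, message):
        # The sent copy exists only on this device.
        self.local_sent.append(message)


class ImapStyleClient:
    """Leaves mail on the server and stores sent copies there as well."""

    def __init__(self, server):
        self.server = server

    def inbox(self):
        return list(self.server.folders["INBOX"])

    def sent(self):
        return list(self.server.folders["Sent"])

    def send(self, message):
        self.server.folders["Sent"].append(message)


def demo():
    """Return what a second device (a laptop) can see under each model."""
    pop_server = MailServer()
    pop_server.deliver("hello")
    desktop, laptop = PopStyleClient(pop_server), PopStyleClient(pop_server)
    desktop.fetch()          # the desktop takes the only copy
    desktop.send("reply")
    laptop.fetch()           # nothing is left for the laptop
    pop_view = (laptop.local_inbox, laptop.local_sent)

    imap_server = MailServer()
    imap_server.deliver("hello")
    desktop2, laptop2 = ImapStyleClient(imap_server), ImapStyleClient(imap_server)
    desktop2.send("reply")
    imap_view = (laptop2.inbox(), laptop2.sent())
    return pop_view, imap_view
```

In this model the laptop sees neither the inbound message nor the sent copy under the POP-style flow, while under the IMAP-style flow it sees both, because the server rather than a single client holds the permanent record.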
Microsoft's
Messaging Application Programming Interface (MAPI) is a proprietary
protocol enabled by Microsoft's mail servers and Microsoft's mail client
applications. MAPI, like IMAP, allows a server-centric mail-service
paradigm. MAPI was designed to support more complex functions than the
exchange of relatively simple mail information; it facilitates the broad
exchange of information between myriad applications: for example, the
exchange of data between a spreadsheet application and a mail client
application running within a given Microsoft operating system. A common
use of MAPI between clients and a central server is to support the exchange
of both mail and calendar data, a function highly valued by select Cornell
schools and departments. Regardless, the extra utility the protocol
offers must be balanced against its proprietary, Microsoft-controlled
server applications.
WEB mail
services are servers that allow users to send and read mail via a web
client, such as Netscape. WEB mail systems are not technically the same
as the protocols outlined above--but they are worth mentioning to afford
broad understanding. Many WEB mail services are really simple gateways
to POP/IMAP/MAPI mail servers. A central WEB mail gateway receives and
transmits requests to browsers using standard browser protocols. In
turn, the WEB gateway sends POP/IMAP/MAPI requests to the mail server.
The value of WEB gateways is that they eliminate the need for dedicated
mail clients and allow users access to mail via the almost ubiquitous
presence of browsers on every personal computer platform.
Impact
of Mobile Computing
With every
passing day, the way users access information evolves. It is not at
all uncommon for users to have and use more than one device to access
their e-mail. These devices range from cell phones to desktop computers.
Many use laptops in one work scenario and a fixed desktop computer in
another. This evolving model coupled with more diverse and flexible
networking [for example, wireless (6)] changes the
basic demands placed on a mail service.
Once there
was a rather strong notion that mail services should "leverage" the
distributed storage implied by the distributed ownership of desktop
computing. Mail, once delivered as a service on a central time-shared
computer accessed via dumb terminals, was replaced by a new distributed
mail service paradigm based primarily on the POP protocol. The idea
was simple: use a central mail service only as an always-on cache to
collect inbound mail. In turn, users would periodically access this
cache and download their mail to their local disk drive. This process
leveraged the disk space attached to personally owned workstations while
providing a robust and "always on" cache for inbound mail. Although
it is possible for a user on POP-based services to leave mail on the
server, Cornell's policy of automatically deleting mail on the server
after 60 days was intended to force users to download information to
their local disk as the primary record.
This model
was arguably effective for traditional mail use, but it is seen by many
as problematic for evolving-use paradigms. These problems include an
- Inability
to access downloaded messages from another network attached device,
such as a cell phone, PDA (personal digital assistant), or laptop
computer.
- Inability
to place outbound messages (sent messages) on a server so they can
be reviewed by another network-attached device.
When using
more than one device to access mail, many people like to review not
only new messages but also messages that were sent and reviewed in the
past. The current Cornell policy of deleting mail on servers and the
general POP service paradigm do not afford users this option. In addition,
the POP protocol does not support saving copies of outbound messages
on the central mail server. In aggregate these two shortcomings--forced
deletion of inbound mail and lack of support for outbound mail copies
on the mail server--make Cornell's central mail service limited in scope
when compared to our peer institutions or locally deployed Microsoft
Exchange alternatives.
The
Dawn of Non-Cornell-Provided E-mail Services
Almost
everyone is now aware of the "free" e-mail services provided on the
Internet. Hotmail and Yahoo mail are but two of note. These services
are designed to be universally accessible (regardless of device) and
offer basic services, for free, far beyond the limited service suite
offered at Cornell. Anyone walking the campus will see a growing number
of students using these mail systems, and their use is likely to continue
to grow.
Understanding
that Cornell's current central mail systems are limited in scope, old
in function and design, and lacking the desired features of free mail
services offered outside the institution, the logical question arises:
Should Cornell eliminate central mail services? Many consider the support
cost for these systems high, projected at $400,000 over the
next fiscal year. In fact, a growing discussion at some higher education
institutions contends there is no need for central mail services, and
open movements exist to eliminate such services. Nevertheless, such
a movement may not be in Cornell's best interest.
Issues
of Identity, Content Control, and Privacy
One argument
against eliminating central mail services is based on the notion of
institutional identity. Many people at Cornell like to see the Cornell
address attached to their mail notes. Clearly User@Cornell.edu gives
a greater sense of pride and identity to a great institution than User@FreeBeeMail.com.
To a certain degree, this concern can be addressed for inbound messages by
a Cornell gateway that forwards Cornell-addressed e-mail
to registered external mail services. This is precisely what Cornell
and other institutions do for their alumni. Although an alumnus's actual
mail service might be on FreeBeeMail.com, he or she can place a Cornell e-mail
address on a business card, and anyone using this address will have
that mail forwarded to the actual non-Cornell mail server. Theoretically,
Cornell can base its on-campus mail systems on the same principle. All
members of Cornell could set up accounts outside of Cornell and use
a Cornell mail gateway to ensure that the Cornell name is associated
with our communication. Such a system may have greater utility than
the current central Cornell mail service suite, and it would certainly
cost far less money.
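The forwarding gateway described above amounts to a lookup from a Cornell address to a registered external mailbox. The following is a minimal sketch under stated assumptions: the FORWARDS table and the resolve_recipient function are hypothetical, the addresses are the illustrative ones used in this paper, and addresses are assumed to compare case-insensitively.

```python
# Hypothetical forwarding table: Cornell address -> registered external mailbox.
FORWARDS = {
    "user@cornell.edu": "user@freebeemail.com",
}


def resolve_recipient(address, forwards=FORWARDS):
    """Return the mailbox that should actually receive mail sent to `address`.

    Addresses with no registered forward are returned unchanged (normalized
    to lower case), meaning the message would stay on a Cornell server.
    """
    key = address.strip().lower()
    return forwards.get(key, key)
```

Note that a gateway of this kind only affects inbound mail. It does nothing about what the external provider adds to, or does with, outbound messages.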
This process,
however, may not be as alluring once we take a critical look under the
proverbial covers. First, many if not all "free" e-mail services litter
your mail with advertisements and obligatory postscripts that have nothing
to do with Cornell. Even though a user might send out an e-mail note
from FreeBeeMail.com with a preferred return address of User@Cornell.edu--the
chaff embedded by the service provider would make a less-than-elegant
mail note. For example:
Dear
Prospective Student,
Enclosed
is an electronic copy of the information you requested about Cornell
University.
Shop
at Tinkletoes.com for the best in summer footwear!
In addition,
many "free" mail services will set attachment size, storage, and inactive
account policies that are tuned for their business model--rules that
are often out of sync with Cornell's needs.
Of more
insidious concern over relegating Cornell e-mail to flexible and free
external services are the issues of content control and privacy. Owners
of these mail services have the means and often the right to access
the mail stored on their servers. Confidential university information
would be relegated to external agencies whose mission and values may
not line up with Cornell's. Cornell should think long and hard before
allowing any ad hoc consortium of e-mail providers to control the access
to its exchange of private, intellectual, and, often, confidential information.
Alternatives
So where
does this leave Cornell? Today we support a central mail system at a
significant cost to provide service of less value to many than
they can find elsewhere. Antithetically, Cornell values the ability
to control a critical service, its good name, and privacy--needs that
are clearly degraded by external mail services. Therefore, most would
agree that the current trend is not optimal and that Cornell must systematically
address its current state of mail services. Failure to do so will only
exacerbate the problem. More users will seek external mail services,
enticed by their flexibility, and, at the same time, Cornell will continue
to lose control over the quality, privacy, and look of communications
written by members of its community.
The most
obvious solution is to retool the current central-mail-service suite.
This retooling would focus on shifting centrally delivered Cornell mail
services from a distributed mail store to a server-centric storage model.
This is enabled easily as the current central Cornell mail server applications
providing POP services also support IMAP. Another alternative would
be to replace central Cyrus-based mail servers with Microsoft's Exchange
(7). Although enabling IMAP/server-centric mail services
is technically trivial, other issues may govern the ability to enable
this enhanced service.
These issues
include
- Security
- Spam/Bulk
Mail & Virus filtering
- Cost
Security
Cornell
currently depends on Kerberos to provide encrypted authentication services.
Unfortunately, the mail clients capable of taking advantage of Kerberized
IMAP connections are quite limited. Support for a larger suite of clients
would require Sidecar modifications or support for SSL (secure sockets
layer; web technology used to encrypt data). Simply stated, although
you could use IMAP today, most login transactions will be via clear
text passwords.
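To illustrate the SSL option only (not Kerberos or Sidecar), the sketch below uses the IMAP4_SSL class from Python's standard imaplib module, so that the login exchange travels inside an encrypted channel rather than as clear text. The function name open_encrypted_mailbox and its factory parameter are assumptions introduced for this example. The factory parameter exists only so the function can be exercised without a live server.

```python
import imaplib
import ssl


def open_encrypted_mailbox(host, user, password, factory=imaplib.IMAP4_SSL):
    """Log in over an SSL/TLS-wrapped IMAP connection.

    Because the whole session is encrypted, the password is not exposed
    on the network the way it is with a plain IMAP login.
    """
    context = ssl.create_default_context()  # also verifies the server certificate
    conn = factory(host, ssl_context=context)
    conn.login(user, password)
    return conn
```

A caller would pass the name of the campus mail host and the user's credentials. Any host name used with this sketch is hypothetical, since the paper does not name one.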
Spam/Bulk Mail & Virus Filtering
Few users
of Cornell's mail systems are unaware of the exponential increase in
unwanted e-mail. Somewhat unscrupulous but entrepreneurial companies
or individuals ply the Cornell community with endless messages hawking
everything from advanced college degrees to pornography. This intrusion
into our workspace, combined with mail messages that spread computer
viruses, is an opportunity to explore new filtering services.
If Cornell could provide central e-mail users with "tools" that they could
elect to use to filter unwanted mail and flag messages containing viruses,
these value-added features may well incite many to take advantage of
central offerings.
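A user-elected filter of the kind suggested above might look roughly like the following. This is a deliberately simple sketch using Python's standard email package: the classify function, the phrase list, and the extension list are all hypothetical, and a production filter would need far more sophisticated rules than subject keywords and attachment names.

```python
from email import message_from_string

# Illustrative lists only; real filters would be tuned and maintained centrally.
SUSPECT_EXTENSIONS = (".exe", ".vbs", ".scr", ".pif")
BULK_PHRASES = ("college degree", "act now")


def classify(raw_message, filtering_enabled=True):
    """Return 'virus-suspect', 'bulk', or 'ok' for one raw mail message."""
    if not filtering_enabled:      # filtering is elective, so users can opt out
        return "ok"
    msg = message_from_string(raw_message)
    # Flag any part whose attachment name ends in a suspect extension.
    for part in msg.walk():
        filename = part.get_filename() or ""
        if filename.lower().endswith(SUSPECT_EXTENSIONS):
            return "virus-suspect"
    # Otherwise look for bulk-mail phrases in the subject line.
    subject = (msg.get("Subject") or "").lower()
    if any(phrase in subject for phrase in BULK_PHRASES):
        return "bulk"
    return "ok"
```

The filtering_enabled flag reflects the point made above that these tools would be elective: the message is flagged rather than discarded, and a user who opts out receives everything unmarked.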
Cost
In addition
to the resources required to provide Kerberized/SSL support for IMAP,
IMAP services may place greater input/output (I/O) and storage demands
on our mail servers. As soon as we encourage central mail storage, we
will need more storage hardware. The additional transfer of "sent" copies
from the desktop will place additional I/O on the systems and the larger
storage architecture attached to the mail servers may need retooling,
not to mention rethinking system memory allocations (8).
In addition, with the advent of a server-centered mail system, disk
quotas would need to be adjusted and users educated as needed. The current
process of simply deleting old mail after 60 days would need to be replaced
by a process of notifying users about excessive disk use. Luckily, many
other universities have already addressed these issues, so the utilities
required and the information on system right-sizing are readily available
for baseline reference.
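The shift from deleting old mail to notifying users about disk use could be driven by a check along these lines. This is a minimal sketch: the function name users_to_notify, the 90 percent warning threshold, and the sample figures in the usage note are hypothetical and are not drawn from any Cornell system.

```python
def users_to_notify(usage_mb, quota_mb, warn_fraction=0.9):
    """Return, in sorted order, the users whose stored mail has reached
    `warn_fraction` of the per-user quota.

    `usage_mb` maps a user identifier to megabytes of mail held on the server.
    """
    threshold = quota_mb * warn_fraction
    return sorted(user for user, used in usage_mb.items() if used >= threshold)
```

For example, with a 100 MB quota and usage of 95, 40, and 120 MB for three users, the first and third would receive a notice. No mail is removed, so the server copy remains the primary record.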
Issues
Surrounding Central vs. Distributed Department-Owned Mail Services
Many Cornell
departments elect to deliver their own mail service. This decision is
often driven by the desire for local control and for mail services superior
to those delivered centrally. It is difficult to argue against the continuation
of this service model. But if we assume a broadening suite of central
mail services, there may be compelling reasons for departments to elect
to use a central alternative.
First,
the central system is paid for by all members of the Cornell community,
regardless of use, and there are clear economies of scale for supporting
one system as opposed to supporting many spread about campus. In short,
the cost difference for supporting mail for 20,000 users is not that
different from supporting mail for 30,000 users. If central systems
are maintained, department systems are arguably redundant services that
waste money. If central services improve, these economic incentives
alone might be enough to encourage more departments to leverage the
service.
Another
intriguing argument for central mail systems is based on user privacy.
When mail is stored on multiple department mail servers, with no clear
policy in place for the rights of those who store mail in these systems,
misunderstanding about users' rights over the control of their mail
becomes more likely. At some universities users feel more comfortable
with a neutral central organization, under the strong guidance
of university policy, holding the only keys to the mail system.
For example, a faculty member may feel more comfortable knowing her
electronic mail is on a system that cannot be accessed by the dean of
the school.
Policy
Ramifications
Although
arguments can be made about the economies and general personal comfort
a centrally delivered mail service might afford Cornell, any mail deployment
topology at Cornell should be in sync with a universally accepted privacy
policy. Cornell's user community has the right to know how their mail
messages can be accessed without their permission. This is not a subject
that can be left to individual department interpretation, because department
servers will often store information delivered from faculty, staff,
and students outside of the department's domain. Only a campus-wide
policy on privacy can address this issue.
Focus
Points
There
are several opportunities outlined above that should be focused on when
pondering the continued support of central mail services at Cornell:
- Reaffirm
the importance of Cornell-based mail services in light of "free" alternatives.
- Retool
central mail systems to a server-centric model to make campus services
competitive with external alternatives.
- Retool
central mail systems to support mobile computing paradigms better.
- Market
the economic and privacy value of central mail services.
- Review
and forge policy as required to protect user privacy within a distributed
or central mail-service paradigm.
- Explore
spam/bulk mail and virus filtering services.
- Consider
the ramifications of emergency notification systems based on ad-hoc
distributed mail services.
Closing
Thoughts
For years
mail services have been seen as a core product for almost every central
information technology unit in higher education. Today, this notion
is challenged at Cornell by the availability of free and advanced external
mail services. In addition, Cornell's history of supporting only the
POP protocol and automatically deleting stored messages has encouraged
local departments to engineer unique solutions or solutions external
to Cornell. In turn, unless Cornell defines a central mail system that
meets its own demands, any investment that merely "polishes" the
old mail paradigm will be a waste of money.
Nevertheless,
few may argue that the current trend toward external services
is in Cornell's best interest long-term. This trend, over time, will
degrade Cornell's control of the content, quality, and "look and
feel" of electronic mail communication.
To address
these concerns, several long-term directions can be explored, which
include
- Eliminating
central mail services for all but a defined group of key administrative
processes.
- Retooling
the central system to support a server-centric model with advanced
services.
- Moving
to a distributed, departmentally governed service overseen by a common
privacy and use policy.
Of course,
change is not pain-free. Retooling central systems will require funding,
and all of the alternatives require ample lead-time to implement. Regardless,
failure to act is in itself a decision that will result in at least
a poor use of university funds; more likely is the loss of control over
a resource as fundamental to communication at Cornell as speech.
Endnotes
(1)
http://asg.web.cmu.edu/cyrus/
(2)
http://www.faqs.org/rfcs/rfc821.html
(3)
http://www.faqs.org/rfcs/rfc2045.html
(4)
http://www.faqs.org/rfcs/rfc1725.html
(5)
http://www.imap.org/imap.vs.pop.html
(6)
http://www.cit.cornell.edu/oit/Arch-Init/wireless.html
(7)
Any large-scale move to Exchange would require extended technical review.
At this time, few higher education institutions with mail volumes equivalent
to Cornell's have based their central mail systems on Microsoft's product.
As such, extensive load testing would be required.
(8)
It is beyond the scope of this paper to outline in detail the impact
IMAP may have on servers. To some degree, increased I/O will be offset
by "header only" downloads of messages stored on mail servers. Unlike
the standard POP use at Cornell, message content is not downloaded unless
that message is read. Because many users receive mail that they never
read, these messages will not increase I/O load.