Data Management
First Monday

Data Management: The Foundation for Web Development in the Retail and Service Industries

The emerging knowledge society will be based on the trading of information. To achieve this we must collect data and collate information then use it intelligently to create knowledge. The Internet is potentially a huge warehouse of data. This volume of data is changing and growing at an exponential rate and should be stored and managed by database management systems. The accurate maintenance of the data is of utmost importance and cannot be left to uncontrolled processes. The means of extracting and delivering the information from this warehouse of data must be automatons, not people, as people cannot move at the speed of bits.

Managing the data content of a commercial Web site is a major undertaking. Expanding a Web site to personalize the content provided to individual visitors, thus creating unique individual interactions, has been frustratingly elusive. Web site developers must use database technology to manage both the site content and the client information to enable profitable global-scale retail and service activities. Web databases have the potential to deliver personalized content to each individual in response to the individual's Web interactions. To achieve this, the individual's Web interactions must be recorded over time using database management systems. Web automatons will be required to vary the content delivered according to these interactions.


The Rise of Web Communication
The Current Status
Web Database Design
Web Supply Database
Client Database

The Rise of Web Communication

Internet technologies are well recognized for their ability to connect individuals to others. This ability to connect has encouraged the growth in the size and scope of personal communication networks. An April, 1999 estimate indicated 159 million people online (Nua Internet Surveys). International Data Corporation (IDC) reported in March, 1999 that approximately 200 million people will access the Web this year, growing to 500 million by 2003 (IDC Directions Conference).

The management of individual communications is a major issue. E-service is a term used for the processing of communications from clients that do not represent actual commercial transactions. At present, these sorts of queries are often left unanswered. The research firm Jupiter Communications released a report in November, 1998 which found that, of 125 major Web sites surveyed, 42 percent did not respond to customer inquiries, took more than five days to reply, or had no method of receiving inquiries by e-mail (Duvall, 1999).

These communications are very important to the process of building trust relationships with clients and to the decision making inherent in purchasing goods or services. One-to-one communication is a very powerful method for transferring ideas, opinions or information. Interpersonal communication plays a major role in the diffusion of ideas and innovation, and it has more dramatic effects than other forms of communication (Littlejohn, 1996).

The quantum leap will be the ability to exponentially multiply one communication from a single source into a multitude of individual targeted communications to individual recipients. This is not merely a faster version of current mass media; instead this targeted communication is personalized to each receiver, in the form of a personal reply to a query or a personal greeting.

The Internet has the potential to integrate personal communication messages with mass media-like distribution. Potentially enormous gains in productivity are possible by harnessing the power of the computer to provide individual communication without requiring actual individuals to produce every unique message.

If the simulation of one-to-one communications using the medium of the Internet is accomplished, then the medium will indeed be the message.

The Current Status

In a knowledge society the most important resource is information. Currently the intervention of people slows the distribution from the speed of bits to the speed of people. Successful Web retailers such as Amazon continue to fail to make a profit. Amazon started with the promises of low overheads, no physical infrastructure, and small staff, yet it now has a staff of over 2,100 people servicing its Web clients. Over 2,100 people, working in physical warehouses filling orders, is a large investment in bricks and mortar. This operation resembles a mail order company, the only digital difference being the form of the catalogues and orders (Grover, 1998). Every successful communication between Amazon and a client requires human Amazon employees to process the order.

A Web bookstore in a knowledge society should be able to remove the people filling the orders; the books should be delivered digitally, to be displayed on tablets or printed at the point of delivery. A book delivered in bits could be customized for every client according to the known preferences of the client in a manner similar to the different "cuts" of movies available now. The distribution chain is wasted delivering content that could be delivered electronically. Electronic delivery of intellectual property assets using the medium of the Internet is the future of the knowledge society.

The music industry is currently involved in the revolution brought about by the move to electronic distribution of their product as bits in the form of MP3 files. The issues that have arisen include the ownership of the copyright, payment and piracy. The industry is currently developing standards that will enable electronic distribution of music and video as bits through the medium of the Internet without compromising the ability of the producers and creators of the electronic property to gain a profit from their work (Koenen, 1999). Once the standards have been developed they will provide a prototype for other types of intellectual property to be delivered as electronic bits.

To enable customization for individual clients, very large quantities of data will be required. Web sites will need to collect and store many small pieces of information: both the original messages that form the content of the site and many details about the individual clients who communicate with the site. The information about each individual will include both what the individual supplies purposefully and what is recorded as the history of that individual's interaction with the Web site. The data collected will replicate the information each person gains through personal contact between two individuals.

Privacy is certainly the largest issue on the table. Privacy worries 88% of consumers, according to a Louis Harris & Associates poll conducted in December, 1998. Currently, laws vary from country to country regarding the collection and disclosure of personal information, although there is a move towards standardization of the laws across national boundaries. The European Union Directive on Data Protection allows European Union countries to block the transfer of personal information to countries that do not offer adequate protection. This Directive has prompted the U.S. government to release data privacy guidelines (Mills, 1999).

The information about the client and the site content will then be correlated and translated by intelligent software into knowledge used to customize products and deliver them to the client. To manage these huge quantities of data, specialized databases and intelligent software will be necessary.

Web Database Design

In the creation of a database foundation for Web-based systems, developers should take note of the lessons learned in older computing environments. They should transfer knowledge from database theory and experiences gained in dealing with issues of shared and persistent data management in other computing environments (Peckham, 1999).

Computers can store messages in queues waiting for distribution into specially designed databases. This message component database can be used to assemble customized individual messages, thanks to profiles for each individual client. Correlated data about the individual from both the supplied input and behavior histories can then be used to create personal profiles. These profiles will be used to generate individual messages for each client, essentially mass customization.
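
As a rough illustration of this idea, the following Python sketch assembles one message per client from a shared store of message components. It is only a sketch under stated assumptions: the ClientProfile fields, the COMPONENTS table and the assemble_message function are hypothetical names invented for this example, and a production system would hold both the components and the profiles in a database management system rather than in memory.

```python
from dataclasses import dataclass, field


@dataclass
class ClientProfile:
    """Hypothetical profile built from supplied input and behavior history."""
    name: str
    language: str = "en"
    interests: list[str] = field(default_factory=list)


# Hypothetical message component store: (component, language) -> template.
COMPONENTS = {
    ("greeting", "en"): "Welcome back, {name}.",
    ("greeting", "fr"): "Bon retour, {name}.",
    ("offer:books", "en"): "New titles have arrived in Books.",
    ("offer:music", "en"): "New releases are available in Music.",
}


def assemble_message(profile: ClientProfile) -> str:
    """Build one message for one client from the shared components."""
    # Fall back to the English greeting when no translation is stored.
    greeting = COMPONENTS.get(("greeting", profile.language))
    if greeting is None:
        greeting = COMPONENTS[("greeting", "en")]
    parts = [greeting.format(name=profile.name)]

    # Add one offer per recorded interest, if a matching component exists.
    for interest in profile.interests:
        offer = COMPONENTS.get((f"offer:{interest}", profile.language))
        if offer is not None:
            parts.append(offer)
    return " ".join(parts)
```

Each stored component is written once, yet two clients with different profiles receive different messages, which is the sense in which the approach amounts to mass customization.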

Web content cannot be published once and then forgotten. It is important to refresh and update information regularly. It is also important to maintain diverse content to support a wide range of client needs. In a changing environment, items quickly become outdated; it is imperative that online material is regularly reviewed to keep it up to date and consistent.

To maintain Web content, an appropriate place to start looking for assistance is the software development industry itself. For many years the software industry has been managing the development and change process for large projects, that is, for the components of information systems. The most effective methods found to date to manage this process have been embodied in the version-controlled source and document management systems commonly used in software development. These management systems should be used to manage not only the programs running on a Web site, but also the content of the site itself.

Software management systems use database technology to automatically manage the revisions of items, track changes and provide coordination mechanisms for multiple developers working in one area. There are facilities to check items out to developers for review and management controls to identify the status and revision level of items. The repository database stores completed items including previous revisions of the items. There are controls to synchronize changes that affect multiple items. Management systems are available to manage diverse collections of text, images, sound and video.

A Web site could use data stored in a management system as content. The site could automatically use the appropriate copy of each item stored in the management system, providing each individual client with customized information.
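
A minimal Python sketch of such a content repository follows. It assumes a deliberately simple model in which every check-in creates a new numbered revision with a status, and the site serves the most recent approved revision. The names ContentRepository, Revision, check_in and live_copy are hypothetical and do not refer to any particular management product.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Revision:
    number: int
    text: str
    status: str  # "draft" or "approved" in this sketch


class ContentRepository:
    """Hypothetical in-memory stand-in for a revision-controlled store."""

    def __init__(self) -> None:
        self._items: dict[str, list[Revision]] = {}

    def check_in(self, item_id: str, text: str, status: str = "draft") -> Revision:
        """Store a new revision; earlier revisions are kept, never overwritten."""
        history = self._items.setdefault(item_id, [])
        revision = Revision(len(history) + 1, text, status)
        history.append(revision)
        return revision

    def history(self, item_id: str) -> list[Revision]:
        """Return every stored revision of an item, oldest first."""
        return list(self._items.get(item_id, []))

    def live_copy(self, item_id: str) -> Optional[Revision]:
        """Return the newest approved revision, which is what the site shows."""
        for revision in reversed(self._items.get(item_id, [])):
            if revision.status == "approved":
                return revision
        return None
```

With this arrangement a draft under review never reaches visitors: the site keeps serving the last approved revision until a newer one is approved.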

Web Supply Database

Currently there are two approaches to deriving information about items for sale on a Web site. One method involves coding each item individually onto a Web page. This approach is very limited, because every time a change occurs, the page code must be altered. Every change must be tested for errors before the page is replaced on the site, which adds up to a very time-consuming process.

A second approach is to use a product inventory control database to store product details. This approach enables online updating on the availability and price of items as they are sold, in the same manner as store computer systems are updated by real-time sales. This approach facilitates inventory management in the online store, always providing up-to-date information to the client.

When the product itself is electronic bits, the product database will contain the actual products for distribution.

The database enables full searches for items using not only keywords but also any data stored in the database, even keywords within product descriptions. It enables selected items to be sorted into different orders to facilitate the clients' demands for personalized distribution.
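
The second approach can be sketched with Python's built-in sqlite3 module. The table layout, the sample products and the function names open_catalog, search and record_sale are assumptions made for this illustration only. The sketch shows a keyword search across names and descriptions, a choice of sort order, and a stock update at the moment of sale.

```python
import sqlite3

# Columns a client may sort by; checked before use in the SQL text.
SORTABLE_COLUMNS = {"name", "price"}


def open_catalog() -> sqlite3.Connection:
    """Create a small in-memory product database with sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE product ("
        " id INTEGER PRIMARY KEY,"
        " name TEXT NOT NULL,"
        " description TEXT NOT NULL,"
        " price REAL NOT NULL,"
        " stock INTEGER NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO product (name, description, price, stock) VALUES (?, ?, ?, ?)",
        [
            ("Road Atlas", "Maps of national highways", 25.0, 3),
            ("City Guide", "Walking maps and cafes", 12.5, 0),
            ("Cookbook", "Weeknight recipes", 18.0, 5),
        ],
    )
    conn.commit()
    return conn


def search(conn: sqlite3.Connection, keyword: str, order_by: str = "price"):
    """Find products whose name or description contains the keyword."""
    if order_by not in SORTABLE_COLUMNS:
        raise ValueError(f"cannot sort by {order_by!r}")
    pattern = f"%{keyword}%"
    query = (
        "SELECT id, name, price, stock FROM product"
        " WHERE name LIKE ? OR description LIKE ?"
        f" ORDER BY {order_by}"
    )
    return conn.execute(query, (pattern, pattern)).fetchall()


def record_sale(conn: sqlite3.Connection, product_id: int, quantity: int = 1) -> bool:
    """Reduce stock as a sale is made; refuse if too few items remain."""
    cursor = conn.execute(
        "UPDATE product SET stock = stock - ? WHERE id = ? AND stock >= ?",
        (quantity, product_id, quantity),
    )
    conn.commit()
    return cursor.rowcount == 1
```

Because price and availability are read from the database at the time of each request, no page code has to be altered when either changes.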

Client Database

For a business to be effective it is most important to gain detailed knowledge about clients. In the past business owners and customers met face to face in the same physical space and through this interaction they developed knowledge about each other. Today many businesses are actively seeking to gain a similar knowledge of their clients by the use of loyalty schemes and reward systems to track customer details and purchasing habits. In exchange for this information the client receives discounts and prizes.

Collecting data from Web clients is very difficult. The Web is often acknowledged as an anonymous medium. Web clients are reluctant to divulge personal information about themselves as this destroys the anonymity inherent in the Web experience. There is certainly some irritation if obstructions to surfing take the form of long questionnaires.

Many Web sites use cookies to monitor client movements and surfing habits. Cookies provide information about repeat visitors, and other sites the surfer may have visited. Cookies are not foolproof as different people may use the same computer and Web account for online access or individuals may use more than one computer and more than one Web account. Clients may delete cookies stored on a machine. Attempts to allocate unique identifiers to machines and clients have met stiff opposition from privacy groups, such as in the recent attempt to locate a Microsoft Windows identifier in the Intel Pentium III chip set.
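
The basic mechanism can be sketched with Python's standard http.cookies module. The cookie name visitor_id and the two helper functions are hypothetical, chosen only for this example. The sketch issues a random identifier to a first-time visitor and reads it back on a later request.

```python
import uuid
from http.cookies import SimpleCookie
from typing import Optional, Tuple

COOKIE_NAME = "visitor_id"  # hypothetical cookie name for this sketch


def issue_visitor_cookie() -> Tuple[str, str]:
    """Create a new identifier and the Set-Cookie header value carrying it."""
    visitor_id = uuid.uuid4().hex
    cookie = SimpleCookie()
    cookie[COOKIE_NAME] = visitor_id
    cookie[COOKIE_NAME]["max-age"] = 60 * 60 * 24 * 365  # one year, in seconds
    cookie[COOKIE_NAME]["path"] = "/"
    return visitor_id, cookie[COOKIE_NAME].OutputString()


def read_visitor_id(cookie_header: str) -> Optional[str]:
    """Extract the identifier from a request's Cookie header, if present."""
    cookie = SimpleCookie()
    cookie.load(cookie_header)
    morsel = cookie.get(COOKIE_NAME)
    return morsel.value if morsel is not None else None
```

The identifier marks a browser, not a person, so the limitations described above still apply: a shared computer yields one identifier for several people, and a deleted cookie makes a repeat visitor look new.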

Transaction related client information is routinely stored in a database when an online sale is processed. This information is directly related to the process of the sale and delivery mechanism.

Web-based forms, filled in by clients, are more direct in collecting data. Usually the form includes rewards, such as free service or further information to be delivered to the client. This reward is used to gather information about the client, such as e-mail address and name, purpose in visiting the site, usage, and opinions. Big rewards demand greater details. A heuristic in this situation is to restrict the information required at the first contact to less than a page; in addition, the information sought should be impersonal and publicly available. More personal information should only be requested once the client has established a trust relationship with a given Web site.

Collated data for each client has considerable value as a commodity in its own right (Baig et al., 1999). Statistical information can be used to secure advertising revenue for a given site and may be utilized in many ways accruing major benefits to the organization. Scott Blum, founder of, sees retail below cost on the Web as a vehicle to secure traffic, which in turn generates income from advertising (Grover, 1998).

Client data is already recognized for its ability to identify targets for marketing initiatives. However, its value as a means to customize every interaction with the client automatically has not yet been realized.


If the purpose of a Web site is to collect and distribute information, doing this intelligently is the key to realizing the potential of our current knowledge economy. This means using automated mechanisms to manage the storage of information and translate it into individually customized messages, removing the human in the middle.

It is essential to apply the knowledge of database theory and the experience of data management gained in other computing disciplines to develop Web-based systems.

On a Web site, data can be found in a content database, a product database and a client database. Used together to personalize a Web experience, these form a formidable body of knowledge. With this information, it is possible to identify clients as they enter a given site and to provide content according to each client's profile. Supplied content will reflect preferences in language, layout, subject matter, surfing habits and previous interactions. Personalized greetings can be included, status information on current transactions displayed and personalized advertising messages targeted. This intelligent use of databases will provide a foundation to support personal communications with every visitor at any time.

About the Author

Susan Chard is a Master of Communications student at Victoria University of Wellington, New Zealand.


C. Baig, M. Stepnik and N. Gross, 1999. "The Internet wants your personal info. What's in it for you?," Business Week (April 5).

C. Bayers, 1998. "The Promise of One To One (A Love Story)."

J. Berst, 1999. "How Eservice Could Put You Out of Business."

Coalition for Advertising Supported Information and Entertainment (CASIE), "Guiding Principles of Interactive Media Audience Measurement."

M. Duvall, 1999. "Customer Service By E-Mail: A Failing Point,",4164,387391,00.html

L.V. Gerstner, 1998. "Cebit '98 Keynote address," (Hanover, Germany; March 18).

M.B. Grover, 1998. "Lost in cyberspace."

R. Koenen, 1999. "MPEG-4 Multimedia for our time," IEEE Spectrum, volume 36, number 2 (February), pp. 26-33.

C. Krol, 1999. "Consumers reach boiling point over privacy issues," Advertising Age, (March 29).

S. Littlejohn, 1996. Theories of Human Communication. 5th ed. Belmont, Calif.: Wadsworth.

O. Malik, 1999. "eCommerce service with a smile."

K. Messner, 1999. "Turning eChaos into eCommerce," Upside (January).

E. Mills, 1999. "Government releases privacy guidelines for Web development sites that collect information from European Union residents," IDG News Service (April 20).

W.J. Mitchell, 1995. City of Bits: Space, Place and the Infobahn. Cambridge, Mass.: MIT Press.

Nua Internet Surveys, 1999.

J. Peckham, 1999. "Data for the Masses," Journal of Database Management (April-June).

W.H. Weiss, 1998. "Internet popularity and use continues," Supervision, volume 59, number 1 (January), pp. 3-6.

Copyright © 1999, First Monday

Data Management: The Foundation for Web Development in the Retail and Service Industries by Susan M. Chard

© First Monday, 1995-2019. ISSN 1396-0466.