Mature, Portable, Data-Driven Systems

       Introduction 

       What is Data-Driven?

              The Language

              The API Library

              The Data-Dictionary

              The Design Tools


       Why is Data-Driven Better?

       The PROs of Data-Driven

       The CONs of Data-Driven

       Is Data-Driven the Right Choice for You?

       What about Portability?

       The Open-Architecture Concept

       An Array-Based Architecture

       Data-Driven Data-Driven Systems

       Meta-Data Storage Techniques

             Using Databases

             Using Repositories

             Using Text Files

             Using Array Files

             Using Token Strings

             Using Linked List Strings

             Using Array Fields

             Storing Arrays to Code


       Performance Issues

       Multi-User Issues

       Data-Integrity

       Memory Management

       Documentation

       CONCLUSION

Introduction

The concept of "Data-Driven" applications has been around for many years. Some languages are ideal for writing Data-Driven applications; others are not. CA-Clipper has always been a better-than-average language for this kind of application, but early releases just couldn't deliver enough power to handle the job for many large, mission-critical applications. Many application developers are now finding that the evolution of the CA-Clipper language, operating systems, and computer hardware in recent years is reason to look at Data-Driven much more seriously than ever before. Take the introduction of multi-dimensional arrays, lexical scoping, virtual memory management, protected-mode linking, replaceable data-drivers, and basically a rock-solid release in CA-Clipper 5.2e. Add to this the improvements in speed and reliability of the DOS operating system and WIN-95 or OS/2 DOS Windows. Throw in the advanced hardware technology with huge amounts of disk space, memory and super-speed 486/586 processors, and we now have the ingredients for building applications a better way: "Data-Driven".

This seminar is designed to help the CA-Clipper/VO programmer understand the concepts of Data-Driven applications and also to present a methodology for designing a Data-Dictionary that will support applications under multiple platforms, such as CA-Clipper/DOS, CA-Clipper/UNIX (via FlagShip), CA-Clipper/WINDOWS (via FiveWin or Clip4Win) or CA-Visual Objects. This seminar is not intended to discuss Data-Dictionary theory. If this is what you want, then refer to the bibliography for a list of reading material. After discussing Data-Driven systems with other developers, I have found that nearly every CA-Clipper programmer has dabbled in efforts to create Data-Driven applications. I suspect that many of you already have a good sense of the terminology and the basic concepts but
(1) are still not programming this way, or
(2) are programming this way but may be looking for some tips to improve your design, or
(3) have tried a Data-Driven design on one or more occasions but were burned by the experience.

The code samples presented and most of the discussion are based on a procedural and "modal" model of an application designed around the capabilities of CA-Clipper/DOS; however, nearly all of these techniques, including the design of the Data-Dictionary tables, can also apply to event-driven, non-modal systems built with CA-Clipper or CA-Visual Objects.

What Is Data-Driven?

Data-Driven applications are those in which the fundamental, functional behavior of the application is defined in data-tables rather than in code. For example, the definition of a menu-system, a data-entry system, a browse configuration, file and field specifications, and dialogue screens can be stored in databases and retrieved while the application is running to drive the application's basic functions. The elements of a Data-Driven application, i.e. the resources, are usually external to the program (the hard-code) and are usually stored in database (*.DBF) files, text files and/or array files.
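
As a minimal illustration of the idea, the sketch below keeps a menu definition in a table of records and lets one small generic routine act on it. It is written in Python rather than CA-Clipper purely for brevity. The record layout, the ACTIONS table and all function names are hypothetical and are not taken from dCLIP.

```python
# Hypothetical menu table. In a real system these rows would live in a
# .DBF or text file outside the program, not in the source code.
MENU_TABLE = [
    {"menu": "MAINMENU", "order": 2, "prompt": "Reports", "action": "run_reports"},
    {"menu": "MAINMENU", "order": 1, "prompt": "Customers", "action": "edit_customers"},
    {"menu": "UTILMENU", "order": 1, "prompt": "Reindex", "action": "reindex"},
]

# The only hard-coded part: the library routines that the data may refer to.
ACTIONS = {
    "edit_customers": lambda: "editing customers",
    "run_reports": lambda: "running reports",
    "reindex": lambda: "reindexing",
}


def menu_items(table, menu_name):
    """Return the rows of one menu, sorted by display order."""
    rows = [row for row in table if row["menu"] == menu_name]
    return sorted(rows, key=lambda row: row["order"])


def choose(table, menu_name, prompt):
    """Run the action attached to a prompt; the behavior comes from the data."""
    for row in menu_items(table, menu_name):
        if row["prompt"] == prompt:
            return ACTIONS[row["action"]]()
    raise KeyError(prompt)
```

Adding, removing or reordering a menu item in this sketch means editing MENU_TABLE only; neither menu_items() nor choose() changes.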

A good, well thought-out, Data-Driven system is "normalized" and "modular". It consists of separate sub-systems that allow the designer to break down the basic elements of an application into logical sub-systems that can then be easily hooked together via menus, controls, push-buttons, validation systems, commands, hot-keys, mouse-events, etc. There are many ways to implement a Data-Driven architecture in a CA-Clipper or a CA-Visual Objects application. A Data-Driven system may be Active, Passive, Template-Based, Array-Based, Object-Oriented, Repository-Based, Code-Generator, Living Application or a hybrid that uses some or all of these techniques. It may be closed-architecture or open-architecture. The developer may choose to purchase a ready-to-go system from another developer or create his/her own system from scratch.

For the purposes of this seminar, I am going to focus on the techniques that I have implemented into Data-Driven systems that I am currently supporting. A discussion of all the different kinds of Data-Driven systems would take much more time than we have available and would be more of a "history" seminar than a "how-to" seminar. Code samples, database definitions and design considerations are all based on our product, dCLIP, therefore I will make reference to functions and file names that start with "DC". The system I will be describing is basically an active, array-driven, living application, open-architecture system. I intend to present an architecture and a methodology that help the developer create supportable, data-driven applications which will meet high standards of speed and integrity but still allow complete flexibility of design.

Given enough time and a little imagination, just about any sub-system in an application can be Data-Driven; however, the most popular candidates for data-driven resources are menus, data-entry screens, browses, queries, reports, graphs, dialogue screens, and database data-dictionaries. It is not practical, in this seminar, to discuss how to design any specific data-driven sub-system, such as a menu or browse system. What is practical, however, is to discuss the issues that apply to all Data-Driven sub-systems, such as architecture, memory, files, integrity, performance, compatibility and portability. Data-Driven systems require a robust language, an API library, a set of design tools, and a well thought-out methodology (the Data-Dictionary).

The Language

CA-Clipper and CA-Visual Objects are ideal languages for Data-Driven applications because they provide an excellent macro-compiler, an expression language, replaceable data-drivers, code-blocks, multi-dimensional arrays, and an open-architecture. These are necessary ingredients in creating a robust Data-Driven application. Other important requirements of the language are a good error-handling system, a memory management system, a powerful database management library and runtime reliability. Probably, the single most important ingredient in the language is an "open-architecture", because without it, any Data-Driven application development system will soon hit the wall in its capability and flexibility.

The Application Interface (API) Library

It can often take years to develop a library of routines that are robust enough to "Data-Drive" an application. Whether applications are Data-Driven or not, most developers create their own custom library of reusable code and use the same functions again and again in the same application or in different applications. This "modularity" concept is necessary to produce applications that are maintainable and cost-effective. Over the years, functions in the library evolve to handle the different requirements in the application code by passing each function a set of parameters defining its behavior.

In a NON-Data-Driven system, this library is designed to be used in applications that make calls to the functions directly in the "compiled code". If a system requires a special menu or a change to an existing menu, then custom code must be written, compiled, and linked into the executable program. A well-designed application library can improve the process by creating a system that allows design changes with a minimal amount of recoding, but if it isn't Data-Driven, there is ALWAYS some recoding, compiling and linking required. Compiled applications are "early-bound" and usually offer the best performance and integrity, but they can often be very painful to maintain or evolve.

In a well-designed Data-Driven system, the library is designed to give the developer the option of deciding which portions of the application will be committed to code (early-bound) and which portions of the application will be Data-Driven (late-bound). It is logical to assume that the portions of an application that should be Data-Driven are those that are constantly changing, require prototyping, or must be written quickly and reliably. An application library designed around the needs of Data-Driven applications will contain functions that will store and retrieve data-dictionary tables and use data-dictionary information to drive the application in the most reliable and efficient manner possible. Application libraries are usually written in the base language, i.e. CA-Clipper, with minimal coding in alternate languages, such as C or Assembly.

Some Data-Driven systems developers prefer to call their system an "engine" rather than an API library. This is because it is usually easier to understand and describe how an engine works than to describe 600 separate library functions. A Data-Driven engine is basically a collection of all the API functions linked into an executable program and organized into a set of menus or a methodology that will expedite the development of an application or allow the execution of an application. Systems that allow customization of the engine to include user-defined libraries, replaceable data-drivers and third-party libraries are "open-architecture" systems. The ultimate open-architecture systems are those that even supply the source code for the API library.

The Data-Dictionary

Over the years, the term "Data-Dictionary" has referred to any system that is a collection of data about your system data. It is a listing of database files, indexes, relations, fields and validation rules. Sometimes referred to as metadata, it is an industry term that describes the databases or tables that have no direct relationship to the business data (the application data). Some Data-Dictionaries will not only include information about how data is handled, but also about the general behavior of the application, such as menu structure, dialogue screens, keyboard and mouse behavior, data-entry and browse configurations, and reports.

Probably the most popular understanding of a Data-Dictionary is that it provides a defined, formalized place to record, develop and hone your assumptions about the client's data. Sometimes the Data-Dictionary is nothing but a documentation standard and provides nothing at all in the way of a design methodology. This seminar does not deal in any way with the concept of documentation, but instead talks about the Data-Dictionary as real data that is used by a set of design tools and by the application to "drive" the resources of an application.

For the purposes of this seminar, we will refer to each resource file as a "Dictionary" file. For example, the database that contains all the menus in a Data-Driven system may be referred to as the "Menu Dictionary" and the database that contains all the information about files, indexes and relations may be referred to as the "File Dictionary". The more sub-systems that a Data-Driven methodology employs in the design, the more robust the application. For example, the dCLIP system supports the following Data-Dictionary files.

DCFILES.DBF - Used to store "FILE GROUPS" consisting of information about databases, data-drivers, indexes and relations that all make up a related group of databases. Each File Group has a Unique "Tag Name" that can be passed to functions in the application library to restore (open) all the files before operations such as data-entry, browsing, or reports.

DCFIELDS.DBF - Used to store field attributes, default validation formulas, picture clauses, descriptors, encryption codes, help codes, etc. for all fields in all databases defined in the application. Each field group is given a unique name matching the ALIAS of its respective database.

DCMENU.DBF - Used to store complete top-bar with pull-down menu systems or sub-menu systems to be used for driving an application. Each Menu has a Unique "Tag Name" that can be passed to functions in the application library for attaching a menu to another menu, to a data-entry screen, to a browse screen, a dialog screen, etc.

DCEDIT.DBF - Used to store editing configurations for data-entry. A configuration can consist of real and/or virtual field placements anywhere within an editing window. Fields may be merged from Parent and Child databases. Field descriptors, screen objects, colors, pick-list options, menu and key tag references, page layout, etc. are all stored in this dictionary for restoring a data-entry screen.

DCBROWSE.DBF - Used to store browsing configurations for multi-window browsing. A configuration can consist of real and/or virtual field columns. Field columns may be merged from Parent and Child databases or configuration may be multiple browse windows consisting of one-to-many relational browses. Column headers, colors, pick-list options, menu and key tag references, scoping options, column totalling options, printing options, etc. are all stored in this dictionary for restoring a browse screen.

DCKEY.DBF - Used to store keyboard "KEY GROUP" definitions. Any key on the keyboard can be temporarily redefined to call a code block, stuff a string, or disable other keys. Key Groups are given "Tag Names" so they can be passed to other functions like data-entry, browse and dialogue screens.

DCUSER.DBF - Used to store log-on information, passwords, and access keys to menus, databases, fields, etc.

DCSCRN.DBF - Used to store Screen objects, such as boxes, backgrounds, text, modal dialogues consisting of Gets, radio-buttons, check-boxes, push-buttons, etc.

DCCODES.DBF - Used to store Code tables for data-entry validations.

DCLOCKS.DBF - Used to store definitions of locks on menu items, databases, fields.

DCFMAP.DBF - Used to store database "MAP GROUPS" for mapping fields between groups of databases. Maps can be used for transferring data from child fields into parent fields after a validation lookup or for filling in default information when appending a new record.

DCPROG.DBF - Used to store "interpreted" programs. Each program can be executed from the interpreter or written to .PRG files for compiling and linking into the executable program. Interpreted programs are written in the base language, i.e. CA-Clipper, and follow the same basic programming rules as compiled code except that LOCALS and STATICS are not supported. Interpreted code can be used to handle special customizing of validations, input screens, reports, etc. that may need to be modified often or prototyped during the development of the program or to handle special custom requirements in the field without requiring a change to the basic executable program.

DCHELP.DBF - Used to store general help and context-specific help screens for a hyper-text style help system.

DCPRINT.DBF - Used to store printer driver data.

DCGATE.DBF - Used to store DOS Gateway information for calling other programs.

DCQUERY.DBF - Used to store Query expressions created by the Query Builder for access by the program.

DCREPORT.DBF - Used to store information about system reports such as pointers to .LBL and .FRM files, pointers to external report drivers such as R&R, or internal report systems such as Bandit.
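
To make the "Tag Name" idea concrete, here is a small sketch of how a File Dictionary might be queried for one File Group. It is written in Python for illustration only. The record fields, the file names and the function names are hypothetical and do not describe the actual structure of DCFILES.DBF.

```python
# Hypothetical File Dictionary rows: one row per database in a File Group.
FILE_DICTIONARY = [
    {"tag": "INVOICES", "alias": "INVITEM", "file": "INVITEM.DBF",
     "driver": "DBFNTX", "indexes": ["ITEMINV.NTX"], "parent": "INVOICE"},
    {"tag": "INVOICES", "alias": "INVOICE", "file": "INVOICE.DBF",
     "driver": "DBFNTX", "indexes": ["INVNO.NTX"], "parent": None},
    {"tag": "CONTACTS", "alias": "CONTACT", "file": "CONTACT.DBF",
     "driver": "DBFNTX", "indexes": ["CONTNAME.NTX"], "parent": None},
]


def file_group(dictionary, tag):
    """Return every row that belongs to the File Group with this tag name."""
    return [row for row in dictionary if row["tag"] == tag]


def open_plan(dictionary, tag):
    """Return the group's rows in the order they should be opened.

    Parents come before children so that relations can be set once both
    sides are open. This simple version handles one level of parent/child
    only; deeper chains would need a proper dependency sort.
    """
    rows = file_group(dictionary, tag)
    # sorted() is stable: False (no parent) sorts before True (has a parent).
    return sorted(rows, key=lambda row: row["parent"] is not None)
```

A library routine that restores a File Group would walk the result of open_plan() and open each file with its driver and indexes before handing control to data-entry, browsing or reporting.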

The Design Tools

Most Data-Driven systems are only as good as the design tools, or "editors", that are used to create the application. For example, it is possible to create a Data-Driven system that will support powerful data-entry screens, menus, browse-screens and dialog screens, but not contain a single design tool. In such a system, the Data-Dictionary files may be .DBFs or text files, and designing the application would require using a standard database browser/editor or a text editor to create or maintain the Data Dictionaries. The task of designing a menu may consist of:
1. Starting up a database browser utility.
2. Opening the MENU.DBF database and its associated index file(s).
3. Browsing the database.
4. Adding a record in the database for each item in the menu.
5. Taking care to ensure that the data in each field is entered properly.
6. Deleting a record and adding a new record if an item needs to be moved.
7. Saving the database changes.
8. Quitting the Database Browser program.
9. Running the application to test the menu.
10. Repeating 1-9 again and again until the menu is completed.

It soon becomes painfully obvious that without a set of design tools it can take just as long, or even longer, to develop a Data-Driven application as it can to develop a hard-coded application. Design tools, sometimes referred to as "resource editors", can make life much easier for the application designer. Such tools provide a quick means of adding and editing items in a menu, a data-entry screen, a dialogue screen, etc. in a true WYSIWYG fashion. They will employ drag and drop features using the mouse and/or keyboard to expedite the process of design and will be "active" rather than "passive".

A passive system is usually one in which the design tool is contained in a program that is external to the application that is being designed. Passive design tools are usually cumbersome to use, do not give the designer an accurate representation of the actual application resource that is being designed, and write "intermediate" data or code that must then be converted in some manner to integrate within the application. For example, using a Windows program to design a menu that will be inserted into a CA-Clipper/DOS application is not WYSIWYG.

The most productive design tools are those that are written to be integrated into an active, "live application". Such tools become part of the Application (API) library and can easily be linked into an application engine to allow the designer to build and test the application "on-the-fly". An open-architecture system will give the designer the option of including or excluding the design tools when the application is delivered to the customer.

Why Is Data-Driven Better?

It isn't always better. For example, a very specialized database application such as an "Appointment Scheduler" system that requires complicated display screens would NOT be a likely candidate for Data-Driven. The basic model for scheduling of appointments varies only slightly from business to business, so a well-written, hard-coded system will very likely perform well for most business requirements. On the other hand, a "Contact Manager" is an ideal application for Data-Driven techniques because it must almost always be specially tailored to meet the needs of the user. Nearly every business handles contacts differently and requires fields, validations, data-entry requirements, reports, etc. that are suited for that business. This is where Data-Driven concepts have proven themselves to be invaluable when developing custom applications. The pace of technology is forcing corporations to be much more flexible in the handling of data than ever before. A new federal or state law, new market opportunities, competitive pressures, etc. can often require that an application be modified to handle a variety of new field and file definitions, data-entry requirements and reports with a rapid turnaround. It is becoming more and more impractical to make these changes in code. A powerful Data-Driven methodology will reduce the time it takes to write an application, make the application more flexible for the user, and make it easier to maintain.

The PROS of Data-Driven

* Less development and debugging time than writing custom code for each application.
* Less risk of de-stabilizing the application and decreasing reliability than writing custom code for each application.
* Less skill required on the part of the application design personnel. They need only to know the requirements of the application, not the intricacies of the language.
* Language translations are much simpler and less costly.
* Simpler version control systems.
* Provides a single point for source code modifications by gathering the source into databases that are organized into a logical system.
* Reduces code size of the Executable program by removing hard-coded information.
* Eliminates or reduces the need to re-compile and link the executable program when making modifications to the application.
* Provides end-users with flexibility and customization.
* Increases dependability, referential integrity and modularity.
* Applications can be ported to other platforms much quicker than hard-coded applications.
* Team programming is easier to manage.
* Developers can support many more custom applications and configurations with a much smaller programming staff.

The CONS of Data-Driven

Data-Driven systems do not eliminate the need for a highly-skilled programmer. On the contrary, developing the application library for a robust Data-Driven system usually takes much longer than writing a custom application, and also requires a more skilled programmer than a custom application does. Whether you are developing a Data-Driven methodology from scratch or intend to purchase a system, the API library must be developed and maintained and must constantly evolve to meet new requirements for the application. Data-Driven systems do not necessarily make life easier for programmers, but then, that is not the intention. If you are a programmer who is looking for an easier job, then don't take on the task of developing a Data-Driven API library. You should instead consider purchasing a complete system that you can learn and maintain.

* Requires longer development cycle to create a Data-Driven API library.
* Requires higher-skilled programmers to develop and maintain a Data-Driven API library.
* Data-Driven applications, unless the API library is well designed, are usually slower in performance than hard-coded applications.
* Data-Driven applications usually require more system resources than hard-coded applications, such as memory, disk-space, network traffic, etc.
* Data-Driven systems are usually more complicated to distribute because they use many more files than hard-coded systems. Forget one file and the application may not run.

Is Data-Driven the Right Choice for You?

Here is a basic checklist that will help you determine if your development team and the applications upon which your business is built are a candidate for Data-Driven systems.

* How many software products or applications do you develop that have the same basic look and feel or serve similar client needs? For example, if your business supports several applications that all use xBASE (record-oriented) databases and rely heavily on data-entry, querying, browsing, and reporting, then they can be developed and maintained more quickly if they are Data-Driven.

* Do your applications require regular redefinition of menu, file, field or data-entry requirements? If the application must be constantly adapted to changing needs, then you should consider a Data-Driven system.

* How much of the application revolves around the process of data-entry and queries such as lookups, searches, browses, and reports? If this percentage is high, then the application could be Data-Driven.

* How much of the application can be designed around a set of menus that call pre-defined functional modules in the library of routines supported by the "engine"? If this percentage is high, then consider Data-Driven.

* Will your applications need to be ported to other operating systems or languages? If the answer is yes, then Data-Driven can improve the portability.

* How powerful is the basic architecture of the "engine" that drives the Data-Dictionary system? If the engine is weak in its capabilities, then much of the application will need to be hard-coded. If the engine is strong and robust in its capabilities and architecture, then much of the application can be Data-Driven.

* Will your applications need to be supported under multiple languages? If yes, consider Data-Driven.

* Does your programming staff consist of one highly-skilled, advanced architect and programmer and several lesser-skilled, programmers or report designers? If you already have this type of development team or you can create this kind of development team, then your applications can be developed the Data-Driven way.

What about Portability?

Earlier, I stated that Data-Driven systems are much more portable than hard-coded systems. This is usually a true statement when referring to the application behavior that is defined in the Data Dictionary files but is not necessarily true when referring to the Application (API) library that is used to create the Data-Driven engine. The library must be written in the same language as the base language. If this language is not supported under the new platform, then the library must be re-written in a language supported by the new platform.

For example, when porting a CA-Clipper based Data-Driven application to UNIX, the process is not nearly as difficult as porting the application to Windows. There are third-party products (such as FlagShip) that can be used to compile the application API library for use on the UNIX operating system. This is possible because the UNIX and DOS operating systems are very similar. Windows, on the other hand, is a completely new paradigm that requires rewriting the application API library in a Windows language. It is logical that you would want to choose a Windows language that closely resembles the CA-Clipper DOS language when converting the API library.

Much of the code in a Data-Driven API library is not affected by the Windows paradigm because it does not affect screen output or user input. For example, the functions that open dictionary files, databases and indexes, save and restore configuration arrays, perform validations, ensure referential integrity, etc. can most likely be converted with little or no source-code modification. Probably the quickest method of converting a CA-Clipper Data-Driven API library to Windows is to use a third-party product such as FiveWin or Clip4Win. The conversion process then consists only of rewriting the functions that interact with the screen and user input. Eventually, you will want to complete the conversion process by evolving the application to a more robust Windows development environment such as CA-Visual Objects. This will require a little more work because almost all the code will need to be modified to handle the subtle differences between the languages.

Another way to ensure more portability of an application is to write the application library using "array-based" or "object-oriented" techniques. Most modern, high-level languages support the concept of multi-dimensional arrays, regardless of the programming paradigm. Arrays are data structures that can hold program information that is identical in design and structure regardless of the base language. An application that is designed around the manipulation and passing of arrays of information is usually much easier to convert than one that is designed around structural code. Arrays are objects. Objects are portable.

Probably the most important decision the designer can make when creating a Data-Driven system is the choice of the method of data-storage to disk. The plethora of new data-driver systems for CA-Clipper and CA-Visual Objects gives the designer lots of options for ease-of-design and performance, but if a non-standard storage system is adopted, the portability advantages of Data-Driven go out the window. See "Meta-Data Storage Techniques" for suggestions on how to choose a storage method.

The Open-Architecture Concept

Basically, most Data-Driven systems maintain a "repository" of information about the application. The method in which this information is gathered and organized determines the maintainability of the application. Many application development systems are purposely "closed-architecture" systems. What I mean by closed-architecture is that the application engine and library are designed to "hand-hold" the developer through the process of developing an application to ensure that all the necessary parts of the application are properly created and linked together. After using a system like this, it soon becomes obvious that the developer is limited to creating applications that must conform to a simple and basic template of functionality, performance, and user-interface; otherwise the organization and the integrity of the Data-Dictionary may be compromised. These "repository-based" systems, while offering good organization of the application, almost always limit the creativity of the developer.

It is becoming more popular to develop applications around Data-Dictionary systems that are "open-architecture". These development systems treat each individual sub-system or "resource" as a complete standalone system (with its own Data-Dictionary) to be used in conjunction with other sub-systems at the discretion of the programmer/developer. They do not force the programmer to adopt the entire methodology, but instead allow him/her to use whichever Data-Driven concept best suits his/her needs. The disadvantage of some open-architecture systems, of course, is that they don't offer a convenient starting point when designing an application. These systems are based around an API library more than a methodology. Getting started with the creation of a Data-Dictionary for an application can be disconcerting in an open-architecture system that has very few design tools.

The very best application generator systems are those that can provide the best of both worlds, i.e., a convenient starting point, a methodology, a good set of design tools, and a variety of choices with a completely open architecture. Most times, when getting started with a new Data-Dictionary system, the developer wants to integrate data or code from an existing system with the new system. A Data-Driven system's API library should be designed to link with any existing system (of the same base language) thus giving the developer the option of moving portions of an existing application over to Data-Driven, or adding a new Data-Driven feature, without impacting the rest of the application. It should also contain "automatic-population" features that will help build a Data-Dictionary from information in the running application. For example, the File and Field Dictionaries can easily be created by scanning all the work areas and capturing the current file, index, field, tag, filter, and parent/child relation information to the Data-Dictionary.
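
The "automatic-population" step can be pictured as a loop over the open work areas that copies what it finds into dictionary records. The Python sketch below is illustrative only. It assumes the work-area information has already been gathered into plain records (in CA-Clipper this kind of information is available from functions such as ALIAS(), DBSTRUCT() and INDEXKEY()), and the record layouts and the populate() function are hypothetical.

```python
# Hypothetical snapshot of the work areas that are open in a running application.
WORK_AREAS = [
    {"alias": "CUSTOMER", "file": "CUSTOMER.DBF", "index_keys": ["CUST_NO"],
     "filter": "", "parent": None,
     "structure": [("CUST_NO", "C", 6, 0), ("NAME", "C", 30, 0)]},
    {"alias": "INVOICE", "file": "INVOICE.DBF", "index_keys": ["INV_NO"],
     "filter": "", "parent": "CUSTOMER",
     "structure": [("INV_NO", "C", 8, 0), ("TOTAL", "N", 10, 2)]},
]


def populate(work_areas, group_tag):
    """Build File and Field dictionary rows from a snapshot of open work areas."""
    file_rows = []
    field_rows = []
    for area in work_areas:
        # One File Dictionary row per work area, grouped under the tag name.
        file_rows.append({
            "tag": group_tag,
            "alias": area["alias"],
            "file": area["file"],
            "index_keys": list(area["index_keys"]),
            "filter": area.get("filter", ""),
            "parent": area.get("parent"),
        })
        # One Field Dictionary row per field, keyed by the database alias.
        for name, field_type, length, decimals in area["structure"]:
            field_rows.append({
                "alias": area["alias"],
                "field": name,
                "type": field_type,
                "len": length,
                "dec": decimals,
            })
    return file_rows, field_rows
```

Rows produced this way give the designer a starting dictionary that can then be refined with validations, pictures and descriptors in the design tools.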

Using an Array-Based Architecture

A Data-Driven architecture that is "Array-Based" offers a lot of flexibility in "tuning" the application and in providing alternatives for storage and retrieval of the Data-Dictionary data. In this kind of architecture, the behavior of the customizable portion of each sub-system is held in arrays. During the design of a Data-Entry screen, Browse-screen, Menu-system, Dialogue screen, etc., the array data is manipulated by the associated "editor" for that sub-system. The arrays must be designed for "persistence", meaning that they can be easily copied to a variety of storage systems, such as Databases, Memo fields, Text Files, Array Files and/or Source code for later retrieval. A persistent array will not contain code-blocks, because they cannot be saved and restored. The code-block problem can be easily resolved by storing a "character-string" equivalent in another array element, then macro-compiling and storing the compiled block to the required element after retrieving the array.
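
The code-block technique described above can be sketched as follows. The example is in Python for illustration: Python's eval() stands in for the CA-Clipper macro operator, a lambda stands in for a code-block, and JSON text stands in for whichever storage system is used. The item layout and the function names are hypothetical.

```python
import json


def make_persistent(menu):
    """Copy the array without its compiled callables, leaving plain data only."""
    return [{"prompt": item["prompt"], "action_text": item["action_text"]}
            for item in menu]


def restore_actions(menu):
    """Re-create each callable from its stored text (the "macro-compile" step)."""
    for item in menu:
        # Only do this with text from a trusted dictionary: eval runs any code.
        item["action"] = eval("lambda: " + item["action_text"])
    return menu


def save(menu):
    """Serialize the persistent form of the array to text."""
    return json.dumps(make_persistent(menu))


def load(text):
    """Read the array back and rebuild the callables it needs at runtime."""
    return restore_actions(json.loads(text))


# A two-item array whose behavior is stored as character strings.
MENU = restore_actions([
    {"prompt": "Add", "action_text": "1 + 1"},
    {"prompt": "Greet", "action_text": "'hello'.upper()"},
])
```

The callable never reaches the storage layer; only its character-string form does, which is what makes the array safe to copy to a database, memo field or text file.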

When an array is retrieved, it is then passed to the required sub-system for execution. For example, a DC_MENUEDIT() function would be a "menu-editor" system that is used to load and edit a menu array. A menu array can be created from data stored in the DCMENU.DBF dictionary, from a *.DCM menu-array file, or from a source-code function. This array may then be passed to a DC_MENURUN() function for executing the menu or saved to the Data-Dictionary.

Before an array is saved it should be assigned a unique "Tag Name" which is used as a reference for later retrieval. For example, after designing a main menu pull-down system you may assign it the tag name "MAINMENU". When saving this menu to the Menu Dictionary, each array element is saved as an individual record in the DCMENU.DBF database and grouped together by a common MENU_NAME field containing the "MAINMENU" tag name. When saved as an array file, the entire array is saved under one file named MAINMENU.DCM. When saved as source code, a *.PRG file is created that contains a function named MAINMENU_M(). This function, when called, will return the contents of the menu array.

The advantages of designing a system this way become obvious when it's time to start building an application. For example, when designing a main menu it is necessary to test the menu often to make sure it looks and behaves properly. In an "active" system, the menu designer may be linked into the same engine that actually runs the application. The menu designer can load a menu array from the Menu Dictionary file, make the changes, and pass the array to the menu execution function for testing. Once you are sure that the menu is functioning properly, you can choose to save it to the dictionary, an array file, or source code.

In an Array-based architecture, all the sub-systems should be designed to retrieve configuration arrays by the fastest method possible. Let's take the example of the DC_MENURUN() function. This is the function that is used to retrieve a menu by its tag name and pass it to DC_MENUMAIN() for execution. To execute a menu with the name "MAINMENU", you would call the function DC_MENURUN("MAINMENU") in your source or in a code-block which may be evaluated from a hot-key, another menu, or a dialogue screen. DC_MENURUN() would process the menu as follows:

* The DC_MENULOAD() function is called to retrieve the menu from its source. This function has the job of returning a menu array like so:

1. First, the MENU CACHE is tested to see if the menu was previously loaded and stored in Cache memory. The size of the cache is determined by DC_MENUCACHE(). If the menu was called previously in the program and still exists in the cache, then it will be retrieved from the array cache. This is the FASTEST method of retrieving a menu array.

2. Second, the program is tested to see if a function named MAINMENU_M() has been linked into the program. If it exists, then the function is called and its return value is passed back as the menu array. This is the SECOND FASTEST method of retrieving a menu array and does not open any files at all. If the function does not exist, then:

3. Third, the directory established as the "Dictionary Directory" is searched for a file named MAINMENU.DCM. If it exists, then it is loaded to an array with the DC_ARESTORE() function and passed back as the menu array. This is the THIRD FASTEST method of retrieving a menu array and does not open any databases. It also provides the advantage of easily updating an application in the field by simply copying an array file into the dictionary directory. If the file does not exist, then:

4. Fourth, the DCMENU.DBF Menu Dictionary is searched for a menu with the tag name "MAINMENU". If it is found, then it is loaded to an array, stored to the menu cache and passed back as the menu array. If not found, then an error message is displayed.

* The array retrieved by DC_MENULOAD() is compiled into an "executable" array that can be passed to DC_MENUMAIN(). Menu arrays must contain code blocks for executing functions. Code blocks cannot be saved to disk so they must be compiled first. The array is passed to DC_MENUCOMPILE() which in turn passes back a compiled array.

* The compiled array is passed to DC_MENUMAIN() for execution.
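
The four-step lookup order can be sketched as follows. This is not the DC_MENULOAD() source: MenuLoadSketch(), MenuFromFile() and MenuFromDbf() are hypothetical names, the two loaders are stubs, the cache has no size limit, and every loaded menu is cached for simplicity. It also assumes that TYPE() reports "UI" for a function that is linked into the program and "U" for one that is not.

```clipper
/* Sketch only: hypothetical names, stubbed loaders, unlimited cache. */
FUNCTION MenuLoadSketch( cName )
   STATIC aCache := {}          /* { { cTagName, aMenu }, ... } */
   LOCAL aMenu, nPos

   cName := Upper( AllTrim( cName ) )

   /* 1. Menu cache */
   nPos := AScan( aCache, {|a| a[1] == cName } )
   IF nPos > 0
      RETURN aCache[nPos, 2]
   ENDIF

   /* 2. Function linked into the program, e.g. MAINMENU_M() */
   IF Type( cName + "_M()" ) == "UI"
      aMenu := &( cName + "_M()" )
   ENDIF

   /* 3. Array file (current directory stands in for the dictionary) */
   IF aMenu == NIL .AND. File( cName + ".DCM" )
      aMenu := MenuFromFile( cName + ".DCM" )
   ENDIF

   /* 4. Menu Dictionary database */
   IF aMenu == NIL
      aMenu := MenuFromDbf( cName )
   ENDIF

   IF aMenu != NIL
      AAdd( aCache, { cName, aMenu } )
   ENDIF
RETURN aMenu

/* Stub: a real version would restore the array from the file. */
STATIC FUNCTION MenuFromFile( cFile )
RETURN NIL

/* Stub: a real version would read the DCMENU.DBF records. */
STATIC FUNCTION MenuFromDbf( cName )
RETURN { cName, {} }
```

Because each step is tried only when the faster ones fail, a menu can be moved from the database to an array file or to linked code without changing any call to the loader.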

In an Array-Based architecture, all Data-Driven sub-systems store and retrieve arrays in basically the same manner, however, the structure of each array will conform to the requirements for its respective sub-system. In the menu example above, the menu array would contain all the information needed to paint the menu on the screen such as menu prompts, sub-menus, help prompts, accelerator keys, code-blocks, access control, colors, and special behavior. The DC_MENUMAIN() function is designed to paint and execute the menu using the information in the array.

Data-Driven-Data-Driven Systems (or Meta-MetaData-Data)

SET SOAPBOX ON

Some developers of Data-Driven systems are proud of the fact that their systems are so finely-normalized that even the design tools, the API library, and the Meta-Data databases are all Data-Driven. I've heard remarks like: "my system drives itself" or "DDMan is written completely in DDMan. It's going to make code obsolete". I have met programmers who never seem to be happy until they can reduce an entire program down to a few lines of code that simply starts up an engine that launches a database that loads the rest of the program.

At first glance, this may seem like a good idea, but my experience with systems like this is that they are not only very slow performing, but they also rely very heavily on a methodology and a core set of data that is the "heart" or the "pulse" of the system. Like Communism, centralized systems eventually fail, become obsolete and unrepairable. Data-Driven systems that drive themselves have little or no foundation and eventually their structure will fall. Such systems are incredibly hard to maintain and to evolve because there's no place to stand when you're rebuilding the building.

A Data-Driven architecture that will stand the test of time is a "democracy" of sub-systems that can choose to work together or choose to stand alone. In my opinion, design tools should not be Data-Driven. The API library should be driven completely by code from the base language, not by the Data-Dictionary. This allows the developer to use any sub-system in a Data-Driven API library without dependency on every other sub-system. For example, dialog boxes and menus used by the API library can still use the same base functions as Data-Driven dialogs and menus, but their configuration data should come from code, not from a data file.

SET SOAPBOX OFF

Meta-Data Storage Techniques

The information in the arrays that drive an application must eventually be stored to disk. The array data can be stored to databases, array files, binary files, text files, repositories, or source code. A robust Data-Driven system will allow the programmer to customize the application engine to use the storage method that serves each specific need. For example, during the design process, it is usually best to store the meta-data in databases or repositories, whereas the delivered application could include the meta-data in array files, binary files, or even the executable program.

As I said earlier, choosing a storage method is probably the most important decision the designer can make, especially where portability and maintainability are concerned. There are numerous opportunities to paint yourself into a corner under an open-architecture system like CA-Clipper or CA-Visual Objects. Many third-party developers are creating data-driver systems with features galore, and lots of these features are finding their way into Data-Driven applications. For example, you can choose to store your Data-Dictionary meta-data in the form of "array-fields" or "blobs". What you must be wary of, however, is how you will maintain your application or migrate it to a new platform when using these "non-standard" database systems.

Developing a Data-Driven methodology is a huge commitment that can pay big dividends when wise choices are made. It can also leave you bankrupt and wondering "what happened?" when the wrong choices are made. Choose your design and your database drivers wisely. Resist the temptation to use a third-party database driver that may not be supported in the future. A well-designed Data-Driven API library will provide the option of replacing database drivers by simply re-linking the application.

I am going to propose a variety of methods for storing application meta-data. No one method is considered to be better than any other. In fact, every one of these techniques is used in our API library.

Using Databases for Meta-Data

This is the most popular method of storing Data-Dictionary information because there are so many database browsers and editors available to programmers. Applications that are stored in databases are usually self-defining because each field of each record is often self-explanatory. A simple browse of the data-dictionary database can usually tell a lot about the application and even a novice programmer can often make minor adjustments to the application by modifying the information in the database. For example, a DCMENU.DBF database might have the following structure:



         Field Name     Type      Length    Decimals  Position

         MENU_NAME      Character      8         0           1
         LEVEL          Character      6         0           2
         TITLE          Character     35         0           3
         TITLE_BLOC     Character    150         0           4
         MESSAGE        Character    250         0           5
         ACCEL_KEY      Numeric        4         0           6
         CODE_BLOCK     Character    150         0           7
         HELP_CODE      Character     22         0           8
         ACCESS         Character      3         0           9
         RETURNS        Numeric        4         0          10

Each menu item would be represented by a record in the database and would be grouped together by the MENU_NAME field. The structure of the arrays will dictate the structure of the database. For example, many sub-system arrays in a Data-Driven system will consist of a "ragged" array made up of two or more sub-arrays. The first sub-array may be a single-dimensional array that contains "general" data about the object, whereas the second sub-array may consist of many more sub-arrays, one for each item in the object. In the case of a menu system, the first sub-array may be a fixed-length, single-dimensional array that contains information about the location of the menu on the screen, colors, menu options, etc., and the second sub-array will vary in length depending on the number of menu selections. This can complicate the structure of the database due to the non-symmetry of the array. This problem can usually be overcome by including a MEMO field or a CHARACTER field that will store the portion of the menu array that is not related to an individual menu item. For example, the FIRST RECORD of a menu group may be used to store all the general information about the menu including the general data sub-array. This can be accomplished by one of several methods:

* Convert the array to "tokens" and store a token-string.
* Convert the array to an "array-string" and store the array-string.
* Store the array as a "blob" using a variable-length field data-driver.
* Store the array as an "array-field" using a data-driver that supports array-fields.

A discussion of each of these array-storage techniques follows later in this seminar.
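
The "ragged" shape described above might look like the following. The element layout and the function name SampleMenu() are hypothetical and chosen only to show the two sub-arrays; the actual menu array of a given library will differ.

```clipper
/* Hypothetical layout for illustration; not an actual library format. */
FUNCTION SampleMenu()
   /* Sub-array 1: fixed-length general data (row, column, colors) */
   LOCAL aGeneral := { 1, 0, "N/BG,W+/BG" }

   /* Sub-array 2: one sub-array per menu item; its length varies */
   LOCAL aItems := { ;
      { "File", "Open or close files",     "{|| FileMenu() }" }, ;
      { "Edit", "Edit the current record", "{|| EditRec() }"  }, ;
      { "Quit", "Leave the program",       "{|| QuitApp() }"  } }

RETURN { aGeneral, aItems }
```

With this layout, each element of the second sub-array maps naturally onto one database record, while the first sub-array is the part that has to be packed into a MEMO or CHARACTER field of the first record.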

PROS

* Saving and restoring array information is simple and requires minimal coding in the application API library.
* Database structures can be easily modified to handle new system requirements.
* Corrupted Data-Dictionaries can be more easily repaired by common database utility programs.
* Databases provide better support for team (multi-user) development projects.

CONS

* Databases (and their indexes) require that the Data-Driven engine include a compatible data-driver.
* Databases (and their indexes) take longer to open.
* Databases (and their indexes) consume more conventional memory.
* Databases (and their indexes) consume more file handles.
* Databases eat up a lot of disk-space due to the "air" in the database consumed by empty fields.

Using Repositories for Meta-Data

Repository systems basically encapsulate all the meta-data into a single file with a proprietary structure. The system API library and/or engine store and retrieve the data from this file. ODBC databases are an example of a repository-style database system that can contain not only the meta-data but also the business data. Most repository systems are based around a centralized IDE (integrated development environment) as opposed to a modular design.

PROS

* Repositories store the entire application in a single database.
* Repositories consume fewer file handles and take less time to open.

CONS

* Repositories cannot be browsed by common utility programs.
* Repositories can become easily corrupted.
* Repositories are not easy to manage in multi-user, team development projects.
* Repository systems require a complete integrated development environment to make even minor changes to the application.
* Repository systems do not integrate well into an open-architecture, modular design.
* Repositories cannot be easily modified to incorporate new design requirements.

Using Text or .INI Files for Meta-Data

Text files are basically ASCII files that contain meta-data information in some form that can be read by the application API library. Windows *.INI files are an example of a good system for storing "initialization" information for an application but they are not necessarily a good choice for Data-Dictionary information. All Data-Driven systems must contain some Public or Static data that defines the global behavior of an application. For example, a user-defined color-system would be designed around a public or static array that contains the default colors for menus, data-entry screens, browse-screens, etc. The array could then be populated by a function that reads a DCCOLOR.INI file that contains the following ASCII information:


[COLORSET_1]
BrowseFrame=N+/W
BrowseData=N/W
BrowseHeader=W+/RB
BrowseMenu=N/BG,W+/BG
BrowseSelect=GR+/R
BrowseUnselKey=W+/B
BrowseSelKey=W+/R
MenuFrame=N/BG
MenuItems=N/BG,W+/BG
MenuSelect=N/W
MenuGrayItem=N+/BG
DialogBoxFrame=B/W
DialogBoxCont=N/W
DialogBoxButton=N/BG
Says=W/B
PendingGets=N/BG
CurrentGet=GR+/R
MemoEditMode=W+/N
MemoMenu=N/BG
MemoMenuHilite=W+/BG
MemoFrame=N/W
MemoDispMode=W+/B
ColorSetName=Standard Color Set

Here are two functions, DC_INILOAD() and DC_INISAVE(), that can be used to support a system of *.INI files for saving or restoring any single-dimensional array. The parameters are:

cIniFile is the name of the *.INI file to load.
cGroup is the Group Name you wish to load.
aArray is the array to populate from the contents of the initialization group. This array must already exist and must contain default values for each item. Values may be the following types: (C)haracter, (N)umeric, (D)ate, (L)ogical.
aReference is a parallel array that contains the same number of elements as aArray. This array contains the name of each initialization item for each item in aArray.



#include "set.ch"     /* defines _SET_EXACT */

FUNCTION dc_iniload ( cIniFile, cGroup, aArray, aReference )

       LOCAL lExact := SET(_SET_EXACT), cValue, nFound, cParam, ;
             xValue, cString, nHandle, aRef, i, lFound

        cGroup := UPPER(IIF(Valtype(cGroup)=='C',cGroup,''))
        aRef := ACLONE(aReference)
        IF Left(cGroup,1)#'['
          cGroup := '['+cGroup+']'
        ENDIF
        nHandle := DC_TXTOPEN( cIniFile )
        SET(_SET_EXACT,.t.)
        FOR i := 1 TO LEN(aRef)
          aRef[i] := UPPER(aRef[i])
        NEXT
        lFound := .f.
        IF nHandle > 0
          DO WHILE !DC_TXTEOF(nHandle)
             cString := ALLTRIM(DC_TXTLINE(nHandle))
             DC_TXTSKIP(nHandle,1)
             IF !EMPTY(cGroup) .AND. cGroup#UPPER(cString)
               LOOP
             ELSEIF cGroup==UPPER(cString)
               cGroup := ''
             ELSEIF Left(cString,1)=='[' .AND. EMPTY(cGroup)
               EXIT
             ENDIF
             IF Left(cString,1)="*" .OR. Left(cString,1)='/'
               LOOP
             ELSEIF !("="$cString)
               LOOP
             ENDIF
             IF '//'$cString
               cString := TRIM(SUBSTR(cString,1,AT('//',cString)-1))
             ELSEIF '/*'$cString
               cString := TRIM(SUBSTR(cString,1,AT('/*',cString)-1))
             ENDIF
             cParam := UPPER(Alltrim(SubStr(cString,1,AT('=',cString)-1)))
             cValue := Alltrim(SubStr(cString,AT('=',cString)+1))
             nFound := ASCAN( aRef, cParam )
             IF nFound = 0
               LOOP
             ENDIF
             xValue := aArray[nFound]
             IF Valtype(xValue)='C'
               xValue := cValue
             ELSEIF Valtype(xValue)='D'
               xValue := CTOD(cValue)
             ELSEIF Valtype(xValue)='N'
               xValue := VAL(cValue)
             ELSEIF Valtype(xValue)='L'
               /* dc_inisave() writes 'Yes'/'No', so compare without regard to case */
               xValue := UPPER(cValue)=='YES'
             ELSEIF Valtype(xValue)='B'
               xValue := &(cValue)
             ENDIF
             aArray[nFound] := xValue
             lFound := .t.
           ENDDO
           DC_TXTCLOSE( nHandle )
         ENDIF
         SET(_SET_EXACT,lExact)
RETURN lFound

         /* ------------------------ */

FUNCTION dc_inisave ( cIniFile, cGroup, aArray, aReference )

    LOCAL nHandle, cString, aIni := {}, i, j, k, cValue, xValue, nIni

         SET(_SET_EXACT,.f.)
         cGroup := UPPER(IIF(Valtype(cGroup)=='C',cGroup,''))
         IF Left(cGroup,1)#'['
           cGroup := '['+cGroup+']'
         ENDIF
         nHandle := DC_TXTOPEN( cIniFile )
         IF Valtype(aArray)='C'
           aArray := { aArray }
         ENDIF
         IF Valtype(aReference)='C'
           aReference := {aReference}
         ENDIF
         nIni := 0
         IF nHandle > 0
           DO WHILE !DC_TXTEOF(nHandle)
             cString := ALLTRIM(DC_TXTLINE(nHandle))
             DC_TXTSKIP(nHandle,1)
             IF Left(cString,1) = '['
               AADD( aIni, { cString } )
               nIni := LEN(aIni)
             ELSEIF EMPTY(cString)
             ELSEIF nIni > 0
               AADD( aIni[nIni], cString )
             ENDIF
           ENDDO
         ENDIF
         DC_TXTCLOSE(nHandle)
         FOR i := 1 TO LEN(aIni)
           IF UPPER(aIni[i,1]) == cGroup
             EXIT
           ENDIF
         NEXT
         IF i > LEN(aIni)
           AADD( aIni, { cGroup } )
         ENDIF
         FOR j := 1 TO LEN(aArray)
           FOR k := 1 TO LEN(aIni[i])
             IF UPPER(aIni[i,k])=UPPER(aReference[j])+'='
               EXIT
             ELSEIF UPPER(aIni[i,k])=UPPER(aReference[j])+' '
               EXIT
             ELSEIF EMPTY(aReference[j])
               EXIT
             ENDIF
           NEXT
           xValue := aArray[j]
           cValue := ''
           IF Valtype(xValue)='C'
             cValue := xValue
           ELSEIF Valtype(xValue)='N'
             cValue := ALLTRIM(STR(xValue))
           ELSEIF Valtype(xValue)='D'
             cValue := DTOC(xValue)
           ELSEIF Valtype(xValue)='L'
             cValue := IIF(xValue,'Yes','No')
           ENDIF
           IF EMPTY(aReference[j])
           ELSEIF k <= LEN(aIni[i])
             aIni[i,k] := aReference[j]+'='+cValue
           ELSE
             AADD(aIni[i], aReference[j]+'='+cValue)
           ENDIF
         NEXT
         nHandle := Fcreate( cIniFile )
         FOR i := 1 TO LEN(aIni)
           FWRITE( nHandle, CHR(13)+CHR(10) )
           FOR j := 1 TO LEN(aIni[i])
             FWRITE( nHandle, aIni[i,j] + CHR(13)+CHR(10) )
           NEXT
         NEXT
         FCLOSE(nHandle)
    RETURN .t.

Using Array Files for Meta-Data

Array files contain array data in a "binary" format. In an array-based architecture, this is the simplest and fastest method of restoring configuration arrays from disk. An array-file Data-Dictionary is basically a set of individual files on the disk, one for each resource object required by the application. A naming convention for array files is usually required to ensure that the required resource object is restored by the application. For example, Menu array files could be given a .DCM extension, Data-entry array files could be given a .DCE extension, Browse array files could be given a .DCB extension, etc. The file name would be the same as the "tag name" of the resource object requested by the program. For example, the function DC_MENULOAD("MAINMENU") would return an array from the contents of a file named MAINMENU.DCM and the function DC_EDITLOAD("CUSTOMER") would return a data-entry array from the contents of a file named CUSTOMER.DCE.

Here are two functions, DC_ASAVE() and DC_ARESTORE(), that can be used to support a system of array files for saving or restoring any multi-dimensional array.
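
As a rough illustration of the array-file idea, the following sketch saves and restores a single-dimensional array of character strings. It is not the DC_ASAVE()/DC_ARESTORE() code: the function names StrArraySave() and StrArrayRestore() and the file layout (a 4-byte element count, then a 4-byte length before each string) are assumptions made for the example.

```clipper
#include "fileio.ch"     /* defines F_ERROR */

/* Sketch only: character elements, one dimension, assumed file layout. */
FUNCTION StrArraySave( cFile, aStrings )
   LOCAL nHandle := FCreate( cFile ), i

   IF nHandle == F_ERROR
      RETURN .F.
   ENDIF
   /* element count, then a length-prefixed copy of each string */
   FWrite( nHandle, L2Bin( Len( aStrings ) ) )
   FOR i := 1 TO Len( aStrings )
      FWrite( nHandle, L2Bin( Len( aStrings[i] ) ) + aStrings[i] )
   NEXT
   FClose( nHandle )
RETURN .T.

FUNCTION StrArrayRestore( cFile )
   LOCAL nHandle := FOpen( cFile ), aStrings, cLen := Space( 4 ), cItem, i

   IF nHandle == F_ERROR
      RETURN NIL
   ENDIF
   FRead( nHandle, @cLen, 4 )
   aStrings := Array( Bin2L( cLen ) )
   FOR i := 1 TO Len( aStrings )
      FRead( nHandle, @cLen, 4 )
      cItem := Space( Bin2L( cLen ) )    /* buffer of the stored length */
      FRead( nHandle, @cItem, Len( cItem ) )
      aStrings[i] := cItem
   NEXT
   FClose( nHandle )
RETURN aStrings
```

A call such as StrArraySave( "MAINMENU.DCM", aPrompts ) followed later by StrArrayRestore( "MAINMENU.DCM" ) needs no database driver and only one file handle, which is where the speed of this method comes from.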

Storing Arrays to Database fields as Token Strings

When an array needs to be stored to a memo field or a character field, a common technique is the "token" method. This is widely accepted because tokenized data is easier to maintain with database editing utilities in the event that the database becomes corrupted. Converting an array to tokens is rather simple but is usually limited to arrays whose elements are all character strings. A tokenized character string is simply a set of data that is separated by a common delimiter character. My personal preference, when storing tokenized data, is to use the vertical bar "|" delimiter because when editing databases, it becomes more obvious that the information in the field is tokenized and should not be tampered with other than by the resource editor.

Here are two functions, DC_ARRAYTOKEN() and DC_TOKENARRAY() that can be used for converting tokenized data.



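  FUNCTION dc_arraytoken( aTokens, cDelim )

  LOCAL i, nLength := LEN(aTokens), cString := ""
  FOR i := 1 TO nLength
    cString += aTokens[i] + IIF(i < nLength, cDelim, "")
  NEXT
  RETURN cString

  /* ------------------------ */

  /* NOTE: the opening of this function, down to the delimiter  */
  /* scan, is assumed; parameter and variable names are guesses. */
  FUNCTION dc_tokenarray( cString, cDelims )

  LOCAL aTokens := {}, nFound, i
  DO WHILE .t.
    nFound := 0
    /* look for any one of the delimiter characters */
    FOR i := 1 TO LEN(cDelims)
      nFound := AT( SubStr(cDelims,i,1), cString )
      IF nFound > 0
        EXIT
      ENDIF
    NEXT
    IF nFound > 0
      AADD( aTokens, SubStr(cString,1,nFound-1) )
      cString := SubStr(cString,nFound+1)
    ELSE
      AADD( aTokens, cString )
      EXIT
    ENDIF
  ENDDO
  RETURN aTokens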

Storing Arrays to Database fields as Linked-List Strings

Most array-objects in a Data-Driven system are multi-dimensional and may be constructed of sub-arrays that are non-symmetrical, ragged-symmetrical, parallel-symmetrical, or combinations. They may also contain elements of mixed types such as character, numeric, date or logical. No matter how complicated the array structure, it can be converted to a string and converted back to the original array (with the exception of code-blocks). Arrays that are converted to strings will be referred to as "string-arrays". These strings can be stored to a character type field or a memo type field in the database. If the array being converted always produces a string-array that has a fairly predictable length, then it may be wiser to store the string-array to a fixed-length character field. If it produces a string-array that can vary greatly in length then it is suggested that it be stored to a variable length memo type field.

Here are two functions, DC_AR2STR() and DC_STR2AR(), that can be used for converting arrays for storage to memo fields. NOTE: This technique has been tested extensively with various array types and memo field drivers and will work with all .DBT memos, .FPT memos, and .DBV memos supported by CA-Clipper, CA-Visual Objects and third-party drivers.

FUNCTION dc_ar2str ( aArray, lHeader )

LOCAL cArray := ''

lHeader := IIF(Valtype(lHeader)='L',lHeader,.f.)
IF lHeader
  cArray := CHR(1)+'Array String:'
ENDIF
_dcStore( aArray, @cArray )
RETURN cArray

/* ----------------- */

FUNCTION dc_str2ar ( cString )

LOCAL nPosition := 1, cArray := cString

/* skip the optional 14-character header */
IF SubStr(cArray,1,14)==CHR(1)+'Array String:'
  cArray := SubStr(cString,15)
ENDIF
RETURN _dcGet( @nPosition, @cArray )

/* ----------------- */

STATIC FUNCTION _dcStore( xThing, cArray )

LOCAL cItem

IF Valtype( xThing ) == "A"
  _dcarray( xThing, @cArray )
ELSE
  cItem := _dcitem( xThing )
  IF Valtype( cItem ) = 'C'
    cArray += cItem
  ENDIF
ENDIF
RETURN nil

/* ----------------- */

STATIC FUNCTION _dcArray( aArray, cArray )

LOCAL i, cItem, cL2bin := l2bin(len(aArray))

/* "A" = binary length, "O" = decimal length (no CHR(26)) */
IF CHR(26)$cL2bin
  cArray += "O"+ DC_l2Dec(len(aArray))
ELSE
  cArray += "A"+ cL2bin
ENDIF
FOR i := 1 TO Len(aArray)
  cItem := _dcitem( aArray[i], @cArray )
  IF Valtype( cItem ) = 'C'
    cArray += cItem
  ENDIF
NEXT i
RETURN nil

/* ----------------- */

STATIC FUNCTION _dcItem ( xItem, cArray )

LOCAL cRetVal, cType := Valtype( xItem ), cL2bin

DO CASE
  CASE cType == "C"
    cL2bin := l2bin( Len( xItem ))
    /* -- memo fields can't store CHR(26) -- */
    IF CHR(26)$cL2bin
      cRetVal := "M"+DC_l2Dec( len( xItem)) + xItem
    ELSE
      cRetVal := "C"+cL2bin+xItem
    ENDIF
  CASE cType == "N"
    IF '.'$STR(xItem)
      xItem := STR(xItem)
      cRetVal := "F"+l2Bin( len( xItem)) + xItem
    ELSE
      cL2bin := l2bin(xItem)
      /* -- memo fields can't store CHR(26) -- */
      IF CHR(26)$cL2bin
        cRetVal := "W"+DC_l2Dec(xItem)
      ELSE
        cRetVal := "N"+l2bin(xItem)
      ENDIF
    ENDIF
  CASE cType == "L"
    cRetVal := "L"+IIF(xItem, "T", "F")
  CASE cType == "U"
    cRetVal := "U"
  CASE cType == "D"
    cRetVal := "D"+l2bin( xItem - ctod("01/01/70") )
  CASE cType == "B"
    /* code blocks cannot be stored; only a marker is written */
    cRetVal := "B"
  OTHERWISE
    _dcStore( xItem, @cArray )
ENDCASE
RETURN cRetVal

/* ----------------- */

STATIC FUNCTION _dcGet ( nPosition, cArray )

LOCAL nLength, i, cAttrib, cRetVal

cAttrib := substr( cArray, nPosition++, 1 )
DO CASE
  CASE cAttrib $ 'CNADF'
    nLength := bin2l( substr( cArray, nPosition, 4 ) )
    nPosition += 4
    DO CASE
      CASE cAttrib == "C"
        cRetVal := substr( cArray, nPosition, nLength )
        nPosition += nLength
      CASE cAttrib == "F"
        cRetVal := VAL(substr( cArray, nPosition, nLength ))
        nPosition += nLength
      CASE cAttrib == "N"
        cRetVal := nLength
      CASE cAttrib == "A"
        cRetVal := array( nLength )
        FOR i := 1 TO nLength
          cRetVal[i] := _dcget( @nPosition, @cArray )
        NEXT i
      CASE cAttrib == "D"
        cRetVal := ctod("01/01/70")+nLength
    ENDCASE
  CASE cAttrib = 'M'
    nLength := dc_dec2l( substr( cArray, nPosition, 12 ))
    nPosition += 12
    cRetVal := substr( cArray, nPosition, nLength )
    nPosition += nLength
  CASE cAttrib = 'W'
    nLength := dc_dec2l( substr( cArray, nPosition, 12 ))
    nPosition += 12
    cRetVal := nLength
  CASE cAttrib == "O"
    nLength := dc_dec2l( substr( cArray, nPosition, 12 ))
    nPosition += 12
    cRetVal := Array( nLength )
    FOR i := 1 TO nLength
      cRetVal[i] := _dcget( @nPosition, @cArray )
    NEXT i
  CASE cAttrib = 'L'
    cRetVal := if( substr(cArray, nPosition++,1) == "T", .t., .f.)
  CASE cAttrib $ 'UB'
    cRetVal := nil
ENDCASE
RETURN cRetVal

/* ------------------- */

/* Convert four 3-digit decimal byte values (low byte first) to a number. */
/* The byte weights are 1, 256, 65536 and 16777216.                       */
STATIC FUNCTION dc_dec2l ( cNum )

RETURN VAL(Substr(cNum,1,3))*1 + ;
       VAL(Substr(cNum,4,3))*256 + ;
       VAL(Substr(cNum,7,3))*65536 + ;
       VAL(Substr(cNum,10,3))*16777216

/* ------------------- */

/* Convert a number to four zero-padded 3-digit decimal byte values. */
STATIC FUNCTION dc_l2dec ( nNum )

LOCAL cVal := l2bin( nNum )

RETURN STRTRAN(STR(ASC(SubStr(cVal,1,1)),3) + ;
               STR(ASC(SubStr(cVal,2,1)),3) + ;
               STR(ASC(SubStr(cVal,3,1)),3) + ;
               STR(ASC(SubStr(cVal,4,1)),3),' ','0')

Storing Arrays to Database fields as Blobs or Array-Fields

It is becoming more and more popular to utilize the special features of third-party data-drivers and the new CA-Clipper 5.3 data-drivers for storing arrays to database fields. Many of these drivers support a new class of memo field that allows storage of any type of data. Some will allow only storage of character strings and arrays. These fields are commonly referred to as "Array-Field memos". Others will allow storage of BLOBS or "binary-large-objects". Blob-storage systems can handle any kind of data from a text file to an .EXEcutable program. For example, a complete set of array files can be stored to a blob database and re-written to disk. The FlexFile data-driver can handle blobs for both CA-Clipper 5.2 and 5.3. The DBFBLOB driver that ships in the box with CA-Clipper 5.3 can also perform this function.

Storing Arrays to Code

Over the years, many application generator systems have been designed around a "passive" Data-Dictionary. The entire application is created by a set of design tools that maintains the Data-Dictionary and then writes out source code based on a "template" for each language supported by the application generator. The code that is generated must be compiled and linked into an executable program using the appropriate language compiler and libraries. These application generators are incapable of producing "user-modifiable" applications because the end application must always consist entirely of compiled code. Writing the data-dictionary information to compiled code provides the advantage of much faster performance, fewer file handles, better memory management, and the reliability of early-binding plus encapsulation of the entire application into an executable program.

Unfortunately, most Data-Driven systems don't give the designer the option of committing desired portions of the Data-Dictionary to code and maintaining other portions of the application as user-definable. Probably the greatest advantage of an Array-Based architecture is the simplicity of converting any portion of the application to executable code. Since the live application is driven by array objects, the process of running an application is simply a system of loading arrays with meta-data and passing the arrays to the appropriate sub-system for execution.

Up to now we have talked about loading the arrays from files like databases and text files, which are older ideas, or array files, which is a newer idea. Another new idea in Data-Driven designs is the concept of linking the Data-Dictionary into the executable program. This is a very simple concept to implement in an Array-Based architecture because the process is merely the conversion of any multi-dimensional array to source code that, when compiled and executed, returns the contents of the original array.

For example, let's say you have designed an application that has been tested so well you haven't needed to modify the menu dictionary in months. You can decide to commit this dictionary, or portions of the dictionary, to code by simply writing it out to a set of .PRG files, compiling the source to .OBJ files, then linking the .OBJ files into the executable program. Let's also say that the function you use to get a menu from the dictionary automatically tests to see if a function exists with the same name as the menu and the suffix _M() before opening the menu dictionary. If it does, then it will call the function to return the menu array rather than extracting it from a file. Here's a little piece of code, based on functions in a Data-Driven API library, that could be used to write out an entire menu dictionary to code:


aMenuNames := DC_MenuList()
FOR i := 1 TO LEN(aMenuNames)
   /* load each menu by tag name, then write it out as <name>.PRG */
   aMenu := DC_MenuLoad(aMenuNames[i])
   DC_Array2Prg( aMenu, ;
     aMenuNames[i], ;
     aMenuNames[i]+"_M()")
NEXT
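
Each file written by the loop above then holds one function that rebuilds one menu array. A hand-written illustration of what such a function might look like for the "MAINMENU" tag follows. The menu layout is hypothetical, and the code actually written by DC_ARRAY2PRG() may be arranged differently.

```clipper
/* Hand-written illustration only; the menu layout is hypothetical. */
FUNCTION MAINMENU_M()
   LOCAL a := Array( 2 )

   /* general data about the menu: row, column, colors */
   a[1] := { 1, 0, "N/BG,W+/BG" }

   /* one sub-array per menu item: prompt, code block as a string */
   a[2] := Array( 2 )
   a[2,1] := { "File", "{|| FileMenu() }" }
   a[2,2] := { "Quit", "{|| QuitApp() }" }
RETURN a
```

Once this function is compiled and linked, calling MAINMENU_M() returns the array directly from memory without opening a file.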

Here's a function, DC_ARRAY2PRG(), that will convert any array to source code:


 
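/* cCode is assumed to be an optional fourth parameter: extra source */
/* lines to place after the LOCAL statement of the generated function */
FUNCTION dc_array2prg ( aArray, cFileName, cFunction, cCode )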

  LOCAL nHandle, cSaveScrn, nChoice := 1, aOutArray := {"",""},;
        nLevel := 2

  cFileName := IIF(VALTYPE(cFileName)='C',cFileName,'')
  IF !('.' $ cFileName )
    cFileName += '.PRG'
  ENDIF
  nHandle := FCREATE( cFileName )
  aOutArray[1] += 'FUNCTION ' + cFunction + CHR(13)+CHR(10)
  aOutArray[1] += 'LOCAL a'+CHR(13)+CHR(10)
  IF !EMPTY(cCode)
    aOutArray[1] += cCode
  ENDIF
  aOutArray[1] += ;
     '_dcaprg'+ALLTRIM(STR(nLevel))+'(@a)'+CHR(13)+CHR(10)
  aOutArray[nLevel] += ;
    'STATIC PROCEDURE _dcaprg'+ALLTRIM(STR(nLevel))+'(a)'+;
     CHR(13)+CHR(10)
  _dcar2prg ( aArray, nHandle, '', aOutArray, @nLevel )
  aOutArray[nLevel] += 'RETURN' + CHR(13)+CHR(10)
  aOutArray[1] += 'RETURN a'+REPL(CHR(13)+CHR(10),2)
  FOR nLevel := 1 TO LEN(aOutArray)
    FWRITE( nHandle, aOutArray[nLevel] )
  NEXT
  FCLOSE(nHandle)
  RETURN nil

   /* -------------------- */

  STATIC PROCEDURE _dcar2prg ( aArray , nHandle, cElement, ;
                               aOutArray, nLevel )

  LOCAL  nElement, cTextLine, cType, nArrayLen, cValue, i, n, ;
         nLength, cDelim, j, cChar, cNewText, cNewLine

  nArrayLen := LEN(aArray)
  aOutArray[nLevel] += 'a'+cElement+' := ARRAY('+;
         ALLTRIM(STR(nArrayLen))+')'+CHR(13)+CHR(10)
  FOR nElement := 1 TO nArrayLen
    IF VALTYPE(aArray[nElement])='U'
      LOOP
    ENDIF
    cValue := aArray[nElement]
    cType := VALTYPE(cValue)
    cTextLine := ''
    DO CASE
      CASE cType='C'
        cDelim := '"'
        IF LEN(cValue)=0
          cTextLine := cDelim + cDelim
        ELSE
          FOR i := 1 TO LEN(cValue) STEP 100
            cNewLine := SubStr(cValue,i,100)
            IF CHR(13) $ cNewLine .OR. CHR(10) $ cNewLine .OR. ;
              '"' $ cNewLine
              cNewText := ''
              FOR j := 1 TO LEN(cNewLine)
                cChar := SubStr(cNewLine,j,1)
                IF cChar=CHR(10)
                  cNewText += cDelim + '+CHR(10)+'+ cDelim
                ELSEIF cChar=CHR(13)
                  cNewText += cDelim + '+CHR(13)+'+ cDelim
                ELSEIF cChar='"'
                  cNewText += cDelim + '+CHR(34)+'+ cDelim
                ELSE
                  cNewText += cChar
                ENDIF
              NEXT
              cNewLine := cNewText
            ENDIF
            cTextLine += cDelim + cNewLine
            IF i < LEN(cValue) .AND. LEN(cValue)>100
              cTextLine += cDelim + '+;'+CHR(13)+CHR(10)
            ELSE
              cTextLine += cDelim
            ENDIF
          NEXT
          IF RIGHT(cTextLine,1)=chr(10)
            cTextLine += '   ' + cDelim + SUBSTR(cValue,i,100) + ;
                      cDelim
          ENDIF
        ENDIF
      CASE cType='N'
        cTextLine := ALLTRIM(STR(cValue))
      CASE cType='D'
        cTextLine := 'CTOD("'+DTOC(cValue)+'")'
      CASE cType='L'
        cTextLine := IIF(cValue,".T.",".F.")
      CASE cType='A'
        _dcar2prg( aArray[nElement], nHandle, ;
        cElement+'['+ALLTRIM(STR(nElement,4))+']',aOutArray,@nLevel ) 
        LOOP
      OTHERWISE
        LOOP
    ENDCASE
    aOutArray[nLevel] += ;
      'a'+cElement+'['+ALLTRIM(STR(nElement))+'] := ' ;
      + cTextLine+CHR(13)+CHR(10)
    IF LEN(aOutArray[nLevel]) > 10000
      aOutArray[nLevel] += 'RETURN' + REPL(CHR(13)+CHR(10),2)
      nLevel++
      aOutArray[1] += ;
        '_dcaprg'+ALLTRIM(STR(nLevel))+'(@a)'+CHR(13)+CHR(10)
      AADD( aOutArray, 'STATIC PROCEDURE _dcaprg' + ;
            ALLTRIM(STR(nLevel))+'(a)' + CHR(13)+CHR(10) )
    ENDIF
  NEXT
  RETURN
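
To make the result concrete, here is what the listing is designed to emit for a small array. This is a hand trace, assuming the array { "File", { 1, .T. } } is passed with the file name "MAIN" and the function name "MAIN_M()" (both names are only examples) and that no extra code text is supplied. The file MAIN.PRG would then contain:

```clipper
FUNCTION MAIN_M()
LOCAL a
_dcaprg2(@a)
RETURN a

STATIC PROCEDURE _dcaprg2(a)
a := ARRAY(2)
a[1] := "File"
a[2] := ARRAY(2)
a[2][1] := 1
a[2][2] := .T.
RETURN
```

The main function only declares the array and hands it, by reference, to one or more static procedures that fill it. A new static procedure is started whenever the current one grows past roughly 10,000 characters of source, which keeps each generated procedure to a size the compiler can handle.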

Performance Issues

The number one concern on the mind of developers who are either considering Data-Driven systems or have experience with them is performance. They worry that they will make a huge investment in a technology only to find out, too late, that it won't live up to performance expectations. Don't let your own past experiences or what you have heard from others decide the issue for you until you have test-driven some Data-Driven systems on the hardware you intend to use for your applications. As I said earlier, the CA-Clipper language and today's computer systems just don't behave like systems you may have experienced in the past. In addition, Data-Driven systems architects have learned a lot about performance issues and have matured in their design approach to utilize techniques that weren't available or were unknown in the past. I have already discussed concepts that you can use in your code to fine-tune an application to perform optimally in the section titled Using an Array-Based Architecture; however, I am going to cover this issue in more detail and give you some more hints for speeding up your Data-Driven applications. Most of the performance hits on Data-Driven applications occur at the time the Meta-Data is being loaded into the resource arrays.

1. Use hard-code whenever possible. A well designed API library will use hard-coded sub-systems to actually run the resource and extract the custom information from a LOCAL array that was previously loaded from the Data-Dictionary. This type of design will cause the menu, data-entry, browse, screen, etc. resource to perform as well as if the entire system were hard-coded.

2. Use an array cache. An array cache is simply a static array that is used to store other arrays when they are requested to be loaded from the Data-Dictionary. Such a system should also be smart enough to manage the cache to give priority to arrays that are requested most often and not flush them from the cache when the cache memory limit is reached.

3. Compile the Array-Data into the executable program. See the section titled Storing Arrays to Code for more information.

4. Use workstation disk resources for Data-Dictionary files. In multi-user systems, the Data-Dictionary databases are stored on a server (along with the business data) to ensure that every workstation is running the same application. The system can be designed with an option to automatically copy to the local hard drive any Data-Dictionary files that have a newer date/time stamp on the server. Accessing data from a local drive is much faster than accessing it from a server.

5. Keep Data-Dictionary databases open. If the system has plenty of memory and is running in protected mode, then it is a good idea to keep all the Data-Dictionary files and indexes open during the running of the application. A system configuration flag can be set to enable this feature. This is especially important when using data-drivers like Advantage xBase Server due to its slow file opening characteristics.

6. Use a 386 protected mode linker for CA-Clipper applications. It's a good idea to link the delivered executable with a 386 mode linker, like CauseWay, because such linkers offer better performance than 286 mode linkers.

7. Provide plenty of system memory. Make lots of EMS available to real mode applications and lots of DPMI memory available to protected mode applications. The CA-Clipper virtual memory manager will start creating swap files when all the array memory is used up. This will greatly hamper performance.
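
The array cache from hint 2 could be sketched roughly as follows. All names here (ArrayCache(), CACHE_MAX, the bLoader code block) are assumptions made for illustration, and the eviction rule, dropping the least-requested entry, is just one simple way to give priority to arrays that are requested most often.

```clipper
// Hypothetical array cache; names and the size limit are assumptions.
#define CACHE_MAX  20            // assumed limit on cached arrays

STATIC aCache := {}              // each entry: { cName, aData, nHits }

FUNCTION ArrayCache( cName, bLoader )

  LOCAL nPos, nLow, i

  cName := UPPER( ALLTRIM(cName) )
  nPos  := ASCAN( aCache, {|a| a[1] == cName } )

  IF nPos > 0                    // cache hit: count it and return
    aCache[nPos][3]++
    RETURN aCache[nPos][2]
  ENDIF

  IF LEN( aCache ) >= CACHE_MAX  // cache full: drop least-requested entry
    nLow := 1
    FOR i := 2 TO LEN( aCache )
      IF aCache[i][3] < aCache[nLow][3]
        nLow := i
      ENDIF
    NEXT
    ADEL( aCache, nLow )
    ASIZE( aCache, LEN(aCache) - 1 )
  ENDIF

  // cache miss: load from the Data-Dictionary and remember the result
  AADD( aCache, { cName, EVAL( bLoader, cName ), 1 } )

  RETURN ATAIL( aCache )[2]
```

A caller would pass the loader as a code block, for example ArrayCache( "MAIN", {|c| DC_MenuLoad(c) } ), so the same cache can serve menus, screens and browses. A production version would probably limit the cache by memory used rather than by a fixed entry count.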

Multi-User Issues

CA-Clipper and CA-Visual Objects are very good multi-user languages and their data-driver systems support the record locking and file locking required for multi-user applications. I will assume that everyone attending this seminar is already familiar with the RLOCK() and FLOCK() functions and that you always write code that opens databases in shared mode, locks records before updating, and commits changes to disk before unlocking the record. Following these simple rules will ensure that your Data-Driven applications will work in a multi-user environment.
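
As a reminder, those three rules look something like this in code. The database name, field name, retry count and function name are all assumptions chosen for the example.

```clipper
// Hypothetical helper showing shared open, lock, commit, unlock.
FUNCTION SaveBalance( nRecNo, nAmount )

  LOCAL lSaved := .F., nTries := 10

  USE customer SHARED NEW        // always open in shared mode
  IF NETERR()
    RETURN .F.                   // the file could not be opened
  ENDIF

  GOTO nRecNo
  DO WHILE nTries-- > 0
    IF RLOCK()                   // lock the record before updating
      REPLACE balance WITH nAmount
      DBCOMMIT()                 // flush the change to disk ...
      DBUNLOCK()                 // ... before releasing the lock
      lSaved := .T.
      EXIT
    ENDIF
    INKEY( .5 )                  // wait half a second, then retry
  ENDDO

  USE                            // close the work area
  RETURN lSaved
```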

In most data-entry systems, it is common to use scatter/gather techniques to prevent the need to keep a record locked during data entry. For example, the data is read into an array or a set of local memvars, the GETS and validations are performed on the temporary data, then the changes are written back to the record. This technique requires that the record be locked only for the few milliseconds required to REPLACE the data. Unfortunately, this technique can complicate the design of "validation" schemes. In a Data-Driven system, the validation system often includes expressions that are evaluated at the time the user completes entry into a field. Many of these validation routines return a logical value based on the data entered into the current field and other fields. When designing validation expressions, it is usually desired to use field names in the expressions, meaning that current data from the database will be used in determining if the data has been entered correctly. In a scatter/gather system, the data hasn't been written to the record yet so there is no way to validate based on a simple expression. Data-Dictionary architects have employed a variety of creative techniques to overcome this problem, however the simplest and probably the most common practice is to just forget the idea of scatter/gather and keep the record locked during the entire data-entry process for that record. This technique can simplify the design but it can also cause problems when a user walks away from his/her workstation or clicks on another window while in the middle of a data-entry process. If your system is designed this way, then consider putting a check in the key-handler idle loop for an inactivity time-out period. This is a common practice for invoking screen savers or automatic logoff routines. It can also include a check to see if the current record is locked and can unlock the record and restore the lock again when the user returns to the application.
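
A minimal sketch of such an idle check follows, assuming the data-entry code calls a key-wait routine of its own instead of INKEY(0). WaitKey(), IDLE_LIMIT and the lHoldsLock parameter are hypothetical names.

```clipper
// Hypothetical key-wait routine with an inactivity time-out.
#define IDLE_LIMIT  120          // assumed seconds before unlocking

FUNCTION WaitKey( lHoldsLock )   // .T. if the current record is locked

  LOCAL nKey := 0, nStart := SECONDS(), nElapsed, lReleased := .F.

  DO WHILE ( nKey := INKEY( 1 ) ) == 0     // poll about once per second
    nElapsed := SECONDS() - nStart
    IF nElapsed < 0              // SECONDS() restarts at midnight
      nElapsed += 86400
    ENDIF
    IF lHoldsLock .AND. !lReleased .AND. nElapsed > IDLE_LIMIT
      DBUNLOCK()                 // let other users reach the record
      lReleased := .T.
    ENDIF
  ENDDO

  IF lReleased                   // user is back: take the lock again
    DO WHILE !RLOCK()            // waits until the record is free
      INKEY( .5 )
    ENDDO
  ENDIF

  RETURN nKey
```

Another user may have changed the record while it was unlocked, so a real system would also need to decide whether to refresh the screen or warn the user after the lock is restored.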

Another issue that complicates multi-user applications is the "configuration" problem. Each user logs on from a different work-station and has different access rights. A Data-Driven system must be designed to handle user-customization such as user-defined colors, access-control, printer-drivers, modem ports for dialers, user-directories, etc. This is usually handled by a user dictionary database that contains one record for each user and an API library with a log-on function that requests a password and stores the user data into static arrays for access by the system. Most configuration issues are straight-forward and can be handled by a field in the user database for each configurable item. Some issues, like color-systems, however, can complicate the problem, especially if the system supports a lot of colors. If the user dictionary database contains one field for each color, then you can waste valuable symbol-table memory on a bunch of fields that are accessed only once, during startup. A better idea is to store the entire color array into a memo field in the user database using the DC_AR2STR() function. Another idea is to place configuration *.INI files or array files into a special directory for each user on the file server and have a field in the user dictionary database that points to this directory.

Data Integrity

A Data-Driven API library should be designed to utilize all the "common" features of the CA-Clipper or CA-Visual Objects replaceable data-driver system and to provide database, index, and referential integrity. For example, a File Dictionary would be used to define the database and index files to open as a related group. The dictionary system should also establish the RDD (replaceable data-driver) to use for the files. This provides the developer with the option of choosing database and index drivers for dBASE, FoxPro, Clipper, Paradox, etc. by simply linking the respective drivers into the executable program. The Field Dictionary would be used to define the fields in each database. It is the job of the API library to use the File and Field Dictionary meta-data when opening files and perform the following functions:

* Use the properly designated RDD for the work group being opened.
* Verify that the databases being opened match the field definitions in the field dictionary.
* Verify that the indexes being opened match the index key and tag name information in the file dictionary.
* Create new databases from the RDD and field information if the database doesn't exist.
* Create new indexes from the RDD and index information if the index doesn't exist or if it is corrupted.
* Establish all the Parent/Child relations for each work group.
* Establish the referential integrity rules for relational databases.
* Establish Key Business Rules, define Domains and establish database triggers to trigger events such as automatically deleting child records when a parent record is deleted.

Needless to say, performing all these tasks in a data-driven application (or any application) is not an easy programming job, and it takes time to develop a good set of functions and a methodology. It is impractical to list source code for this part of a Data-Driven application because it is a finely integrated process and cannot be described in simple terms. Any third-party Data-Driven library product that includes source code is the best source for this information.

Memory Management

Data-Driven systems are often notorious for using up all the memory resources of a computer system because of the amount of "air" or "dead-space" that exists in the databases and/or the arrays. Data-Driven, Array-Based systems may be wonderful in architecture, but they can consume too much memory if the application programmer gets carried away with a design that opens too many databases, instantiates too many data-entry or browse windows, or nests too many sub-system calls.

It's good to remember that an average Data-Driven system will usually consume twice as much memory as an average hard-coded system with the same functionality. Memory may be getting cheaper, but that doesn't necessarily mean that more of it is becoming available to the database application. End-users are demanding that their database applications reside in memory at the same time as their Windows applications yet they refuse to purchase the memory needed for such an environment.

Data-Driven systems should be designed to test the amount of memory available during the start of an application, then limit the number of tasks that can be active at any time based on the memory available rather than allowing the system to crash with an "out of memory" error. Forcing garbage collection just before opening files and/or loading arrays helps reduce memory fragmentation. This can be accomplished by using the FT_IDLE() function from the public-domain Nanforum Toolkit library. A system can be incorporated in the API library that forces Data-Dictionary files to be closed and array-caches to be flushed if available memory drops below a pre-established amount. To eliminate the need to keep the files open, the system should never access Data-Dictionary databases other than to load arrays.
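
One simple way to limit active tasks is sketched below. It assumes every task (window, browse, data-entry screen) calls a pair of gatekeeper functions; the function names and both limits are hypothetical and would need tuning for a real application. MEMORY(0) returns an estimate, in kilobytes, of the memory available for character values.

```clipper
// Hypothetical task gatekeeper; names and limits are assumptions.
#define MIN_FREE_KB  150         // assumed minimum free memory
#define MAX_TASKS    8           // assumed ceiling on active tasks

STATIC nTasks := 0               // tasks currently active

FUNCTION TaskBegin()

  IF nTasks >= MAX_TASKS .OR. MEMORY(0) < MIN_FREE_KB
    ALERT( "Please close a window before opening another." )
    RETURN .F.                   // caller must not start the task
  ENDIF
  nTasks++
  RETURN .T.

FUNCTION TaskEnd()

  IF nTasks > 0
    nTasks--
  ENDIF
  RETURN NIL
```

Each sub-system would then start with IF !TaskBegin() ; RETURN NIL ; ENDIF and call TaskEnd() on the way out.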

Event-driven systems are starting to dictate the expected behavior of modern database applications now that the Windows paradigm has become ingrained in our psyche. Unfortunately, this concept wreaks havoc with a computer's memory resources because each instantiation of a form or a window must allocate a large block of static memory. If no limits are placed on the user and he/she is allowed to freely open as many data-entry and browse windows as he/she wants, the system will eventually crash. When designing a data-driven system, the programmer should place restrictions on the number of instantiations allowed. In most non-event-driven systems, this memory-utilization limitation is usually automatic (provided that the arrays are LOCAL or PRIVATE), because the allocated memory is automatically released when the user exits from the task. Programmers are under much more pressure these days, however, to maintain data-entry and browse configurations in STATIC or PUBLIC arrays to allow the user to easily navigate between different work areas and data-entry screens. A system that is designed this way is wonderful for the user, but a nightmare for the programmer.

Databases use one symbol in the symbol-table for each database field. Applications with many databases and lots of fields can consume so much conventional memory that they can cause a "conventional memory exhausted" error unless this small memory pool is well-managed. If all fields in the application are pre-defined and declared to the compiler, then the fields are added to the symbol table at link time rather than at run time. This early-binding improves the memory characteristics of the program. It isn't usually practical to do this in Data-Driven applications because it tends to defeat the purpose of Data-Driven, however, it really is not at all difficult to write a function that will traverse the field dictionary file and write a .PRG file that contains nothing but FIELD declarations. The file can then be compiled and linked into the application engine to help resolve runtime memory problems.
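
Such a function might look like the sketch below. The field dictionary layout is an assumption: a database named FIELDDIC with one record per field and a character column FLD_NAME. Substitute whatever your own dictionary uses.

```clipper
// Hypothetical generator: FIELDDIC.DBF and FLD_NAME are assumed names.
FUNCTION WriteFieldPrg( cPrgFile )

  LOCAL nHandle, cName, aSeen := {}

  USE fielddic SHARED NEW READONLY
  IF NETERR()
    RETURN .F.
  ENDIF

  nHandle := FCREATE( cPrgFile )
  IF nHandle == -1               // output file could not be created
    USE
    RETURN .F.
  ENDIF

  DO WHILE !EOF()
    cName := UPPER( ALLTRIM( fielddic->fld_name ) )
    // the same field name may appear in several databases, so each
    // name is written only once (exact match via the code block)
    IF !EMPTY( cName ) .AND. ASCAN( aSeen, {|c| c == cName } ) == 0
      AADD( aSeen, cName )
      FWRITE( nHandle, "FIELD " + cName + CHR(13)+CHR(10) )
    ENDIF
    SKIP
  ENDDO

  FCLOSE( nHandle )
  USE
  RETURN .T.
```

Calling WriteFieldPrg( "FIELDS.PRG" ) as part of the build, then compiling and linking the result, keeps the declarations in step with the dictionary without giving up the Data-Driven design.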

Documentation, Modularity and Conventions

I have inherited many projects over the years, and have rarely picked up a project that included any documentation at all. Any Data-Driven system that lacks a good documentation system or does not provide a modular API-library can be very difficult to use or maintain. When developing a system, write the system as though it will be sold and supported as a library product rather than simply used for in-house projects. This will force the programmer(s) to establish high standards for modularity, documentation and maintainability and will greatly increase the probability of a successful project. Establish a system of programming conventions, then document and enforce the conventions among the programming team.

CONCLUSION

Developing and maintaining a Data-Driven system can be challenging and exciting. The project requires a skilled programmer with vision, commitment and patience. If you possess these qualities, then you will be rewarded by your development efforts. Don't hesitate to spend a few hundred dollars looking at third-party Data-Driven products. Even if you decide to develop your own system, a third-party system will provide you with many good ideas and save you thousands of dollars in development costs.