The Logical Data Model » Building a logical data model » Converting source data to analytical data

Converting source data to analytical data

If there are no existing systems and you are just beginning your data warehousing initiative, you can build the logical data model based heavily on current user requirements. However, most logical models begin with an examination of the source data once existing systems are developed and implemented. The source data usually has some sort of documented physical structure. For example, most OLTP systems have an entity relationship diagram (ERD). An ERD provides a graphical representation of the physical structure of the data in the source system, which lets you easily recognize tables and columns and the data stored in those columns.

A logical data model is similar in concept to an ERD. However, in this guide the logical data model also takes into account how your data can be integrated into MicroStrategy to develop a business intelligence solution.

Whether you start from nothing or have an existing source system to use, the steps to create a logical data model are as follows:

Step 1: Identify the facts
Step 2: Identify the attributes
Step 3: Determine attribute relationships
Step 4: Define hierarchies

The details in these steps are related to using an existing source system.

Step 1: Identify the facts

Using your existing data, make a list of all data that can be represented as facts in MicroStrategy. Remember that facts can be calculated and are usually numeric and aggregatable, for example, sales and profit figures. After you have all the facts listed, determine the business level at which each fact is recorded. For example, in retail models, sales facts are often stored at the store, item, or day level, meaning that a sale takes place in a particular store, for a particular item, on a particular day. A product inventory fact, however, can be stored at the region, item, or week level. These business levels become the attributes in your logical data model (see Step 2: Identify the attributes).

Step 2: Identify the attributes

Uncover attributes by considering the levels at which you would like to view the facts on your reports. Start by looking at the levels at which each fact is recorded and build from there.

For example, in the existing data there may be fact data recorded only at the day level. However, your users are interested in analyzing data at more than just at the day level. They also want to view their data at the year, month, and week levels. This information may only be apparent to you after you deploy your project and you determine that a high percentage of your users are viewing sales data at the yearly level. This analysis requires MicroStrategy to aggregate the sales data from the day level to the year level. To improve performance and meet the requirements of the majority of your users, you can include an aggregate table that stores sales data at the year level (see Using summary tables to store data: Aggregate tables). You can then design a Year attribute for your project. This practice is sometimes a reaction to user requirements established after project deployment, but such considerations should be taken into account during your initial project design initiative.

Be careful not to include more facts and attributes than necessary. It is usually unnecessary to bring all data from the source system into the analytical environment. Only include facts and attributes that can serve your user community. Logical data modeling is an iterative process; if necessary, you can always add more attributes and facts later.

Step 3: Determine attribute relationships

Once you have identified your data to be defined as attributes in MicroStrategy, you must then determine which attributes are related to each other. For example, in a project, opportunity information is stored with an Opportunity attribute which is directly related to the attributes Opportunity Close Date, Opportunity Open Date, Primary Competitor, and so on. These attributes are all related to the Opportunity attribute because they all answer questions about opportunity information.

Additionally, you should determine the type of relationship. For example, in the diagram below, Year has a one-to-many relationship to Month, and Month has a one-to-many relationship to Day. This one-to-many relationship specifies that, for every year, several months exist, and for every month, several dates exist. From the reverse perspective the same relationship specifies that, for a number of dates (in a form such as 12/01/2005), only one month exists (in a form such as Dec 2005), and for a number of months, only one year exists.

This example may not accurately define how you store time information. Consider the Year to Month attribute relationship type of one-to-many. If you define the attribute Month as simply the month name (Dec, Jan, and so on) and not directly connected to a year (Dec 2005, Jan 2006, and so on) then the relationship would become many-to-many.

If you have documentation for the existing data, such as an ERD, it is likely that the documentation provides some additional details about the nature of the data and any inherent relationships.

Attribute relationships are discussed in detail in Attribute relationships.

Step 4: Define hierarchies

Hierarchies provide a structure for your data and can help your users easily and intuitively browse for related attributes and include them in a report. In the context of a logical data model, think of hierarchies as logical arrangements of attributes into business areas. For example, you can organize all time-related attributes into the Time hierarchy. You can have a Customer hierarchy containing all attributes related to your customers and a Supplier hierarchy for all attributes related to supplier data.

Depending on the complexity of your data and the nature of your business, you may have very few hierarchies or you may have many. It is possible that all the data is directly related, in which case you may have one big hierarchy. Again, the requirements of your user community should help you determine what hierarchies are necessary.