Existing source systems

Understanding what data is available is an important step in creating a logical data model. Existing data is usually abundant, consisting of a large number of facts and attributes. You must determine which facts and attributes in the existing data are necessary to support the decision support requirements of your user community.

While a review of your data is initially helpful in identifying components of your logical data model, you may not find all the facts and attributes to meet your needs within the data itself. The existing data should suggest a number of facts, attributes, and relationships, but a substantial portion of the work in creating a suitable logical data model involves determining what additional components are required to satisfy the needs of the user community.

For example, an insurance company’s transactional system records data by customer and city, but the business analysts want to see data for different states or regions. Because state and region do not appear in the existing source data, you need to obtain them from another source. Additionally, although data is stored at a daily level in the source system, users also want to see data at the monthly or yearly level. In this case, you can plan additional attributes, such as month and year, to provide the levels at which you intend to analyze the facts in your data model.
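
The insurance example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record layout, the `CITY_LOOKUP` table standing in for the second source, and the helper names `enrich` and `total_by` are all hypothetical and do not come from any particular source system or product.

```python
from collections import defaultdict
from datetime import date

# Hypothetical lookup obtained from another source: city -> (state, region).
CITY_LOOKUP = {
    "Austin": ("Texas", "South"),
    "Dallas": ("Texas", "South"),
    "Boston": ("Massachusetts", "Northeast"),
}

# Hypothetical daily records, stored by customer and city as in the example.
daily_claims = [
    {"customer": "C1", "city": "Austin", "day": date(2023, 1, 5), "amount": 100.0},
    {"customer": "C2", "city": "Dallas", "day": date(2023, 1, 20), "amount": 250.0},
    {"customer": "C3", "city": "Boston", "day": date(2023, 2, 3), "amount": 75.0},
]


def enrich(record):
    """Add the attributes that the source system does not store."""
    state, region = CITY_LOOKUP[record["city"]]
    return {
        **record,
        "state": state,
        "region": region,
        "month": record["day"].strftime("%Y-%m"),  # monthly level
        "year": record["day"].year,                # yearly level
    }


def total_by(records, *levels):
    """Sum the amount fact at the requested attribute levels."""
    totals = defaultdict(float)
    for r in records:
        key = tuple(r[level] for level in levels)
        totals[key] += r["amount"]
    return dict(totals)


enriched = [enrich(r) for r in daily_claims]
by_state_month = total_by(enriched, "state", "month")
by_region_year = total_by(enriched, "region", "year")
```

Here state and region come from the lookup rather than the transactional data, while month and year are derived from the stored day, so the same fact can be analyzed at levels the source system never recorded directly.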

Although some data may not exist in a source system, this does not mean that it should be excluded from the logical data model. Conversely, not everything you find in the source data needs to be included in the logical data model. User requirements should drive the decision on what to include and what to exclude.