written 5.5 years ago by |
The environment for data warehouses and marts includes the following:
- Source systems that provide data to the warehouse or mart
- Data-integration technology and processes that prepare the data for use
- Different architectures for storing data in an organization’s data warehouse or data marts
- Different tools and applications for the variety of users.
- Metadata, data-quality, and governance processes that ensure that the warehouse or mart meets its purposes.
Source Systems
There is typically some “organizational pain” (i.e., business need) that motivates a firm to develop its BI capabilities. Working backward, this pain leads to information requirements, BI applications, and source system data requirements.
The data requirements can range from a single source system, as in the case of a data mart, to hundreds of source systems, as in the case of an enterprisewide data warehouse.
Modern organizations can select from a variety of source systems: operational/transactional systems, enterprise resource planning (ERP) systems, Web site data, third-party data (e.g., customer demographic data), and more.
The trend is to include more types of data (e.g., sensing data from RFID tags). These source systems often use different software packages (e.g., IBM, Oracle) and store data in different formats (e.g., relational, hierarchical).
Data Integration
In addition to storing data in their source systems, organizations need to extract the data, transform them, and then load them into a data mart or warehouse. This process is often called ETL, but the term data integration is increasingly being used to reflect the growing number of ways that source system data can be handled.
For example, in some cases, data are extracted, loaded into a mart or warehouse, and then transformed (i.e., ELT rather than ETL).
Data extraction can be performed either by handwritten code (e.g., SQL queries) or by commercial data-integration software. Most companies employ commercial software.
This software makes it relatively easy to specify the tables and attributes in the source systems that are to be used, map and schedule the movement of the data to the target (e.g., a data mart or warehouse), make the required transformations, and ultimately load the data. After the data are extracted they are transformed to make them more useful.
Storing the Data
A variety of architectures can be used to store decision-support data. The most common architecture is one central enterprise data warehouse, without data marts. Most organizations use this approach, because the data stored in the warehouse are accessed by all users and represent the single version of the truth.
Another architecture is independent data marts. This architecture stores data for a single application or a few applications, such as marketing and finance. Limited thought is given to how the data might be used for other applications or by other functional areas in the organization. This is a very application-centric approach to storing data.
Metadata
It is important to maintain data about the data, known as metadata, in the data warehouse.
Both the IT personnel who operate and manage the data warehouse and the users who access the data need metadata.
IT personnel need information about data sources; database, table, and column names; refresh schedules; and data-usage measures.
Users’ needs include data definitions, report/query tools, report distribution information, and contact information for the help desk.
Data Quality
The quality of the data in the warehouse must meet users’ needs. If it does not, the data will not be trusted and ultimately will not be used.
Most organizations find that the quality of the data in source systems is poor and must be improved before the data can be used in the data warehouse.
Some of the data can be improved with data-cleansing software, but the better, long-term solution is to improve the quality at the source system level.
This approach requires the business owners of the data to assume responsibility for making any necessary changes to implement this solution.
Governance
To ensure that BI is meeting their needs, organizations must implement governance to plan and control their BI activities. Governance requires that people, committees, and processes be in place.
Companies that are effective in BI governance often create a senior-level committee comprised of vice-presidents and directors who (1) ensure that the business objectives and BI strategies are in alignment, (2) prioritize projects, and (3) allocate resources. These companies also establish a middle management–level committee that oversees the various projects in the BI portfolio to ensure that these projects are being completed in accordance with the company’s objectives.
Finally, lower-level operational committees perform tasks such as creating data definitions and identifying and solving data problems. All of these committees rely on the collaboration and contributions of business-unit personnel and IT personnel.
Users
Once the data are loaded in a data mart or warehouse, they can be accessed. At this point the organization begins to obtain business value from BI; all of the prior stages constitute creating BI infrastructure.
There are many potential BI users, including IT developers; frontline workers; analysts; information workers; managers and executives; and suppliers, customers, and regulators.
Some of these users are information producers whose primary role is to create information for other users. IT developers and analysts typically fall into this category.
Other users-including managers and executives-are information consumers, because they utilize information created by others.
The benefits of data warehousing include the following:
- End users can access needed data quickly and easily via Web browsers because these data are located in one place.
- End users can conduct extensive analysis with data in ways that were not previously possible.
- End users can obtain a consolidated view of organizational data.
These benefits can improve business knowledge, provide competitive advantage, enhance customer service and satisfaction, facilitate decision making, and streamline business processes.