Need For Data Warehousing
There are numerous reasons why an organization needs to run a data warehouse. The most obvious is to separate analytical workloads from production systems (Woodie, 2012). A data warehouse minimizes the chance that a large query will cause congestion or downtime in the organization’s transaction processing. In other words, the data warehouse brings consistency to the production system, since only the copies of production data held in the warehouse are queried. For this company, the main transactions involve the collection and analysis of data. The company has previously had a small workforce of 20 employees. However, it is currently seeking to expand, which means a larger workforce and customer base. With such changes, the organization will need to upgrade its current software to improve efficiency. The company will not have to shut down to implement these changes (Woodie, 2012). Instead, it has to upgrade the software and implement the data warehouse so that all data is consolidated at all times, irrespective of its source, such as SQL Server or an Oracle database.
Apart from improving the efficiency of the organization, the data warehouse will improve its decision-making capabilities. A data warehouse equips an organization with a reliable source of data. The best decisions are made when relevant and up-to-date data is available, and decisions made this way are best suited to challenging times. For a company specializing in data collection and analysis, a data warehouse will be relevant for data validation, reformatting, reorganization, summarizing, restructuring, and supplementation with data from other sources (Woodie, 2012). The resulting stored data can be used to generate reports, dashboards, and portals. This information matters to the changing organization because it allows it to reach more valid decisions than its competitors. In a competitive market, organizations with access to updated and actionable information gain an edge over their rivals. Access to current and valid data is a critical tool for analyzing current and long-term trends, while giving the organization continuous feedback on decision effectiveness and alerts on opportunities and problems.
The best practices of data warehousing that the organization will implement incorporate business intelligence. Business intelligence has been adopted by large organizations, which view it as passing through a data warehouse (Woodie, 2012). Business intelligence comprises the processes, technologies, and tools needed to convert data into information, information into comprehension, and comprehension into plans that drive business action toward profitability. When connected to business intelligence, the data warehouse is a source of data for high-profile applications and tools that interact with the user.
This organization will also focus on data reorganization, which will involve the creation of data marts. These marts will contain information from single subject areas such as sales (Ponniah, 2011). With data marts, the database administrator can change the database design to make it easy to query rather than easy to process transactions. In a well-designed data mart, the field and file names are easy to understand. This is an added advantage, as the cryptic names used in most databases give way to descriptive names. In a data warehouse, data is designed for reporting. As a result, the data is easier to work with, and the database is easy to understand even for end users. Predigesting will also be included to help the organization anticipate the types of inquiries and reports that may be requested from raw data in the data warehouse (Ponniah, 2011). This means that metadata will be developed and stored. The metadata will include new fields such as averages, deviations, and summaries derived from source data. A variety of such metadata fields will be applied to support valid reporting and analysis. In the operational data warehouse, the data model will be designed to remain stable. A stable data model requires no changes to the generated reports, since there are no irregular changes to accommodate. In addition, new metadata and data fields added over time have to be introduced in a way that requires no rewriting of reports (Ponniah, 2011).
Other best practices in building the data warehouse are establishing data integrity checkpoints in the source system, offering continued training, and building checks into the system. The data warehouse will also be designed to use table and index partitioning.
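The derived metadata fields mentioned above (averages, deviations, and summaries) can be illustrated with a minimal Python sketch. The sales figures, the region names, and the summarize_by_region function are hypothetical and serve only to show how such fields might be precomputed from raw source rows.

```python
from collections import defaultdict
from statistics import mean, pstdev

# Hypothetical raw rows from a source system: (region, sale_amount)
raw_sales = [
    ("north", 100.0),
    ("north", 300.0),
    ("south", 50.0),
    ("south", 150.0),
    ("south", 100.0),
]


def summarize_by_region(rows):
    """Return derived fields (count, total, average, deviation) per region."""
    grouped = defaultdict(list)
    for region, amount in rows:
        grouped[region].append(amount)

    summary = {}
    for region, amounts in grouped.items():
        summary[region] = {
            "count": len(amounts),
            "total": sum(amounts),
            "average": mean(amounts),
            "deviation": pstdev(amounts),  # population standard deviation
        }
    return summary


sales_summary = summarize_by_region(raw_sales)
```

Storing results like sales_summary alongside the raw data is one way reports could read precomputed values instead of recalculating them on every request.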
From the diagram, the company represents the producer of a given product, for instance, Toyota producing cars of different models. The company is the table name, while its identification number and name are the fields. To understand consumer trends for this company, relationships must be created between the company and its consumers. Other relationships link the company to its products and its competitors. For a company product, say the Premio, data is collected on its buyers and competitors. The car buyers are then analyzed in terms of region, sex, and other demographic information. The data collected here is relevant for identifying consumer trends, preferences, and tastes. This information is, however, coupled with information on competitors, who are analyzed for the features that differentiate their products. The company product ranking shows which products are most preferred. With the data warehouse, metadata will be formed for company products, consumers, and competitors. Data from the com_cons, com_prod, and com_compe tables will be copied to the data warehouse and analyzed for trends in consumer tastes. The product with the highest number of consumers will be identified as the best-selling product. This product will then be analyzed by its company and description.
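The best-selling product lookup described above could be sketched as follows, using Python's built-in sqlite3 module with an in-memory database. Only the table names com_prod and com_cons come from the text. The column names and sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Column names and rows below are hypothetical; only the table names
# com_prod and com_cons appear in the text.
conn.executescript("""
CREATE TABLE com_prod (
    prod_id     INTEGER PRIMARY KEY,
    prod_name   TEXT,
    description TEXT
);
CREATE TABLE com_cons (
    cons_id INTEGER PRIMARY KEY,
    prod_id INTEGER,
    region  TEXT,
    sex     TEXT
);
INSERT INTO com_prod VALUES (1, 'Premio', 'sedan'), (2, 'ModelB', 'hatchback');
INSERT INTO com_cons VALUES
    (1, 1, 'east', 'F'),
    (2, 1, 'west', 'M'),
    (3, 2, 'east', 'F');
""")

# Count consumers per product and keep the product with the most consumers.
best_seller = conn.execute("""
SELECT p.prod_name, p.description, COUNT(c.cons_id) AS consumers
FROM com_prod AS p
JOIN com_cons AS c ON c.prod_id = p.prod_id
GROUP BY p.prod_id, p.prod_name, p.description
ORDER BY consumers DESC
LIMIT 1
""").fetchone()
```

Grouping by region or sex instead of by product would give the demographic breakdowns the paragraph mentions.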
Specific information about a product will be obtained through a view, which is queried with a SELECT call. To avoid conflict with base tables, a view can be given a distinguishing name such as v_Product (O’Neil & O’Neil, 2001). This way, a query that uses the view is easy to read. A view also makes it easy to control data access for different user types. For instance, a person may want to display the car models stored in a company database. This can be achieved by creating a car models view sorted by model name. Indexes help a person find records within a table more quickly. An index is built on one or more table columns. For a given field, the index maintains the list of values sorted in ascending or descending order, which makes it easy to access rows in index order. Index values can be unique or non-unique. A unique index requires each row to have a distinct value in the indexed columns, whereas a non-unique index allows the indexed columns to hold duplicate values. In some database management systems, such as Oracle, a unique index is established on the primary key columns and no null values are allowed in these columns (O’Neil & O’Neil, 2001).
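A minimal sketch of the view and index ideas above, using Python's built-in sqlite3 module rather than Oracle. The view name v_Product comes from the text. The product table, its columns, the index names, and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The product table, its columns, and its rows are assumptions for illustration.
conn.executescript("""
CREATE TABLE product (
    prod_id    INTEGER PRIMARY KEY,
    model_name TEXT NOT NULL,
    price      REAL
);
INSERT INTO product VALUES
    (1, 'Premio', 20000),
    (2, 'ModelA', 18000),
    (3, 'ModelB', 18000);

-- The view exposes only the model names, hiding the price column.
CREATE VIEW v_Product AS SELECT model_name FROM product;

-- Unique index: no two rows may share a model name.
CREATE UNIQUE INDEX idx_product_model ON product (model_name);

-- Non-unique index: duplicate prices are allowed.
CREATE INDEX idx_product_price ON product (price);
""")

# Query the view with a SELECT call, sorted by model name.
models = [row[0] for row in conn.execute(
    "SELECT model_name FROM v_Product ORDER BY model_name"
)]

# The unique index rejects a second row with an existing model name.
try:
    conn.execute("INSERT INTO product VALUES (4, 'Premio', 21000)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
```

Here the sort order is applied in the query against the view, since a database is not generally obliged to preserve an ordering defined inside the view itself.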
From the diagram, the company has customers, products, and competitors. These three elements are required to define the company being analyzed. Each company product is defined in terms of its price, value, and description, along with other attributes such as the date sold and the region sold. The product attributes can be used to analyze the product in terms of technology use and other features affecting consumer trends. The company also has consumers, who are defined in terms of their location, sex, age, and hometown, among others. With such information, this organization can analyze consumer patterns based on location, purchased products, date of purchase, region, and other attributes. The regions with the most customers for a given product can be reported to the producing company as ready markets. Likewise, products mostly bought by young people can be analyzed for their use of technology and other contemporary features that attract attention. Sex is crucial in identifying trends in the products and services mostly preferred by men or women. For instance, one can say that company y has product x that is mostly associated with women. One can proceed further to say that company y has product x which has characteristics a, b, and
The HAS relationship indicates possession. A company has three significant attributes: customers, products, and competitors. The relationship between the company and each of these elements is a one-to-many relationship (O’Neil & O’Neil, 2001). Such relationships are extremely common in databases. The fundamental idea is that each row on the left side of the relationship is associated with an unspecified number of rows on the right side. This relationship is written as 1:n, where n represents an unspecified value. With this form of relationship, data can be normalized across the tables so that redundant data does not have to be stored. In addition, it is possible to store NULL columns.
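The 1:n relationship described above might be sketched with a foreign key, again using Python's built-in sqlite3 module. The table names company and com_prod follow the text. The column names and sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

# Column names and rows are hypothetical. One company row (the "1" side)
# can be referenced by many com_prod rows (the "n" side).
conn.executescript("""
CREATE TABLE company (
    com_id   INTEGER PRIMARY KEY,
    com_name TEXT NOT NULL
);
CREATE TABLE com_prod (
    prod_id     INTEGER PRIMARY KEY,
    com_id      INTEGER NOT NULL REFERENCES company (com_id),
    prod_name   TEXT NOT NULL,
    description TEXT              -- may be NULL
);
INSERT INTO company VALUES (1, 'Company Y');
INSERT INTO com_prod VALUES
    (10, 1, 'Product X', NULL),
    (11, 1, 'Product Z', 'example description');
""")

# The company name is stored once and joined in, not repeated per product.
products_per_company = conn.execute("""
SELECT c.com_name, COUNT(p.prod_id)
FROM company AS c
JOIN com_prod AS p ON p.com_id = c.com_id
GROUP BY c.com_id, c.com_name
""").fetchall()

# A product that points at a company that does not exist is rejected.
try:
    conn.execute("INSERT INTO com_prod VALUES (12, 99, 'Orphan', NULL)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
```

The same pattern would apply to the company's consumers and competitors, each in its own table keyed back to company.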
In this organization, information flows from the customer through the organization to the different companies providing products or services. Input data is collected from customers who use the organizational website to fill in questionnaires about products they have purchased. Customers can also use the manufacturing company’s website to give feedback. From the different organizations’ databases, information about customers is directed to different marts depending on the product or service category. Information about customers of different products and services can also be collected through the company’s website, questionnaires, interviews, and other surveys. Information collected this way is reliable because it is obtained directly from clients, and participants in interviews and questionnaire distribution are randomly selected to avoid bias. In addition, online data collection is open to persons of different nations, races, and religions.
The consumer is the external entity. Information is obtained from the customer through online fill-in forms or through questionnaires and interviews whose data is captured online. The forms and the data keyed in are then inspected and validated against laid-down criteria. Customer information is then input and stored in the com_cons table. Data evaluation then classifies this information by product and by the company producing the product or service. For each identified product, the organization seeks a description from the manufacturer and classifies products into categories, for instance soaps, cooking oils, cars, laptops, and others. Under laptops, the qualities of each product are then defined. Customers’ preferences and tastes are determined from the category output with the most customers. Such information is then provided to the laptop companies upon consultation. Poorly performing companies can be given information on areas that need improvement, areas to eliminate, technologies to adopt, and others.
References
O’Neil, E., & O’Neil, P. (2001). Database: Principles, programming, and performance. Morgan Kaufmann Publishers. USA.
Ponniah, P. (2011). Data warehousing fundamentals for IT professionals. John Wiley & Sons, Inc. USA.
Woodie, A. (2012). Why you need a data warehouse. Retrieved from http://www.itjungle.com/fhs/fhs042412-story03.htmls