What is Data Mining?

Data mining is a process companies use to transform raw data into useful information. The term is widely used in computing and comes from English; its Portuguese equivalent is "mineração de dados".
Generally, data mining, sometimes called data or knowledge discovery, is the process of analyzing data from different perspectives and summarizing it into useful information that can be used to increase revenue, reduce costs, or both.
By using software to look for patterns in large amounts of data, companies can learn more about their customers and develop more effective marketing strategies, as well as increase sales and lower costs.
Data mining software is one of a number of analytical tools that aggregate and analyze data. It allows users to analyze data from different dimensions and angles, categorize it, and summarize the relationships it identifies.
Technically, data mining is the process of finding correlations, or patterns, between dozens of fields in large relational databases.
For the process to work, however, data mining depends on effective data collection and storage, as well as computer processing.
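To make "finding correlations between fields" concrete, here is a minimal sketch that computes the Pearson correlation coefficient between two numeric fields. The field names and values (visits, spend) are hypothetical and only illustrate the idea; real data mining tools examine many fields at once and use a range of techniques beyond this one.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length numeric fields."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two fields of equal length with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations (numerator) and the two sums of squares.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant field has no defined correlation")
    return cov / sqrt(var_x * var_y)

# Hypothetical customer records: store visits per month and monthly spend.
visits = [2, 4, 5, 7, 9]
spend = [40.0, 75.0, 90.0, 130.0, 170.0]
r = pearson(visits, spend)
print(f"correlation between visits and spend: {r:.3f}")
```

A value close to 1 or -1 suggests a strong linear relationship between the two fields, while a value near 0 suggests none. A correlation alone does not show that one field causes the other.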
Real Data Mining Examples
Supermarkets are well-known users of data mining techniques. Many of them offer free loyalty cards that give customers access to reduced prices not available to shoppers without a card.
These cards make it easy for stores to track who is buying what, when they are buying it and at what price.
Stores can then analyze this data and use it for various purposes, such as offering discounts targeted to customers’ shopping habits and deciding when to put items on sale and when to sell them at full price.
Another, hypothetical, example:
A supermarket chain uses data mining software to analyze local shopping patterns and finds that when men buy diapers on Thursdays and Saturdays, they also tend to buy beer.
Further analysis shows that these shoppers typically do their weekly shopping on Saturdays and buy only a few items on Thursdays. From this, the retailer can conclude that these customers buy the beer for the coming weekend.
The supermarket chain can use this information in several ways to increase its sales, for example by moving the beer display closer to the diapers.
It could also make sure that diapers are sold at full price, without discounts, on Thursdays.
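A pattern like "diapers, therefore beer" is usually expressed as an association rule and scored with support, confidence and lift. The sketch below computes these three measures for a single rule. The baskets are invented for illustration and the function name rule_metrics is hypothetical; this is not the output of any real retailer's system.

```python
def rule_metrics(transactions, antecedent, consequent):
    """Return (support, confidence, lift) for the rule antecedent -> consequent."""
    n = len(transactions)
    if n == 0:
        raise ValueError("no transactions to analyze")
    has_a = sum(1 for t in transactions if antecedent in t)
    has_c = sum(1 for t in transactions if consequent in t)
    has_both = sum(1 for t in transactions if antecedent in t and consequent in t)
    # Support: share of all baskets that contain both items.
    support = has_both / n
    # Confidence: share of antecedent baskets that also contain the consequent.
    confidence = has_both / has_a if has_a else 0.0
    # Lift: confidence relative to how common the consequent is overall.
    lift = confidence / (has_c / n) if has_c else 0.0
    return support, confidence, lift

# Hypothetical shopping baskets, one set of items per transaction.
baskets = [
    {"diapers", "beer", "chips"},
    {"diapers", "beer"},
    {"diapers", "milk"},
    {"beer", "chips"},
    {"milk", "bread"},
    {"diapers", "beer", "bread"},
    {"milk", "eggs"},
    {"bread", "eggs"},
]
support, confidence, lift = rule_metrics(baskets, "diapers", "beer")
print(support, confidence, lift)
```

A lift above 1 means the two items appear together more often than they would if purchases were independent, which is the kind of signal that would justify moving the beer display.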
Data
Data is any fact, figure or text that can be processed by a computer. Today, organizations accumulate a vast and growing amount of data in different formats and different databases. That includes:
Operational or transactional data, such as sales, costs, inventories, payroll and accounting;
Non-operational data, such as industry sales, forecast data and macroeconomic data;
Metadata: data about the data itself, such as the logical database design or data dictionary definitions.
The patterns, associations, or relationships between all this data can provide information. For example, analyzing point-of-sale and sales transaction data can yield information about which products are being sold and when.
The information can be converted into knowledge about historical patterns and future trends. For example, summary information about supermarket retail sales can be analyzed in light of promotional efforts to provide insight into consumer buying behavior.
Thus, a manufacturer or retailer can determine which items are most susceptible to promotional efforts.
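One simple way to estimate which items are most susceptible to promotional efforts is to compare average sales during promotion periods with average sales outside them. The following is a rough sketch under that assumption; the sales figures are invented, the function name promotion_uplift is hypothetical, and a real analysis would also need to account for seasonality, price and other factors.

```python
from collections import defaultdict

def promotion_uplift(sales):
    """Map each item to (mean units on promotion) / (mean units off promotion).

    `sales` is an iterable of (item, on_promotion, units_sold) records.
    Items lacking either kind of period are skipped.
    """
    # For every item keep [total units, number of periods], split by promo flag.
    totals = defaultdict(lambda: {True: [0, 0], False: [0, 0]})
    for item, on_promotion, units in sales:
        bucket = totals[item][bool(on_promotion)]
        bucket[0] += units
        bucket[1] += 1
    uplift = {}
    for item, by_flag in totals.items():
        promo_sum, promo_n = by_flag[True]
        base_sum, base_n = by_flag[False]
        if promo_n == 0 or base_n == 0 or base_sum == 0:
            continue  # not enough data to compare
        uplift[item] = (promo_sum / promo_n) / (base_sum / base_n)
    return uplift

# Hypothetical weekly sales records: (item, on_promotion, units_sold).
weekly_sales = [
    ("soda", False, 100), ("soda", False, 120),
    ("soda", True, 220), ("soda", True, 220),
    ("salt", False, 50), ("salt", False, 50),
    ("salt", True, 55), ("salt", True, 55),
]
uplift = promotion_uplift(weekly_sales)
most_responsive = max(uplift, key=uplift.get)
print(uplift, most_responsive)
```

In this made-up data, soda sells about twice as much when promoted while salt barely moves, so soda would be the better candidate for promotional spending.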
Data Storage
Dramatic advances in data capture, processing power, transmission and storage capabilities are allowing organizations to integrate their diverse databases into “data warehouses”.
Data warehousing is defined as a centralized data management and retrieval process.
Data warehousing, like data mining, is a relatively new term, although the concept has been around for years. Data warehousing represents an ideal vision of maintaining a central repository of all organizational data.
Data centralization is necessary to maximize user access and analysis. Impressive technological advances are making this vision a reality for many companies.
Furthermore, dramatic advances in data analysis software are allowing users to access this data freely. Data analysis software is what supports data mining.
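The idea of a central repository can be sketched in a few lines using Python's built-in sqlite3 module. The example loads records from two hypothetical source systems, each with its own layout, into a single table and then queries across both. The table name, column names and records are assumptions made for illustration; a production data warehouse is far larger and typically runs on dedicated database software.

```python
import sqlite3

# Two hypothetical source systems, each with its own record layout.
pos_sales = [("2024-01-04", "diapers", 3), ("2024-01-04", "beer", 6)]
online_orders = [{"day": "2024-01-06", "sku": "diapers", "qty": 2}]

# The "warehouse": one central, in-memory database with a common schema.
warehouse = sqlite3.connect(":memory:")
warehouse.execute(
    "CREATE TABLE sales (sale_date TEXT, product TEXT, units INTEGER, source TEXT)"
)

# Load each source, mapping its fields onto the shared columns.
warehouse.executemany("INSERT INTO sales VALUES (?, ?, ?, 'pos')", pos_sales)
warehouse.executemany(
    "INSERT INTO sales VALUES (:day, :sku, :qty, 'online')", online_orders
)
warehouse.commit()

# Analysis can now run over all sources at once.
units_by_product = dict(
    warehouse.execute("SELECT product, SUM(units) FROM sales GROUP BY product")
)
print(units_by_product)
```

Once the data sits in one place with one schema, a single query can answer questions that would otherwise require combining results from separate systems by hand.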