Business Intelligence and Data Analysis

Durga Acharya
Jun 14, 2022

Businesses need to collect a large amount of data in order to grow. They analyze that data and use it to make appropriate decisions, some strategic and some operational. For example, if a company is launching a new product aimed at manufacturing companies, it needs to collect data on how many manufacturing companies operate in the target location, which company uses which product, and so on, so that it can devise the best marketing strategy. As the volume of data continues to increase, so does the demand for business analytics (Wake Forest University, 2022). The paragraphs below therefore explain what business and data analytics are, and how a business can take advantage of them in the real world.

Business and data analytics is the process of transforming data into insights to improve business decisions (Wake Forest University, 2022). However, Oracle (2022) differentiates business analytics from data analytics. Data analytics refers to the process of collecting large amounts of data, extracting insights, and making predictions about the future, whereas business analytics is the process of taking that data and putting it into prebuilt business content and tools that accelerate the analytics work (Oracle, 2022).

Data analysis involves several steps. The first is collecting data from different sources, such as social media, apps, Google, devices, and so on, and storing it in a database. The second step is data mining, which turns the raw data into a usable format. The third step is descriptive analysis, followed by predictive analysis. The last step is visualizing and reporting the data. These steps are discussed in the paragraphs that follow.
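The steps can be sketched in a few lines of Python. This is a minimal illustration using only the standard library; the sales figures, the function names, and the naive trend forecast are all assumptions made for the example, not part of any tool discussed in this article.

```python
from statistics import mean

# Step 1: collect. Hypothetical raw records, as they might arrive from different sources.
raw_sales = ["120", "135", None, "150", "n/a", "165"]


def clean(records):
    """Step 2: keep only the values that can be read as numbers."""
    cleaned = []
    for value in records:
        try:
            cleaned.append(float(value))
        except (TypeError, ValueError):
            continue  # skip missing or unreadable entries
    return cleaned


def describe(values):
    """Step 3a: descriptive analysis of what has already happened."""
    return {"count": len(values), "mean": mean(values), "min": min(values), "max": max(values)}


def predict_next(values):
    """Step 3b: naive prediction that extends the average period-to-period change."""
    changes = [later - earlier for earlier, later in zip(values, values[1:])]
    return values[-1] + mean(changes)


def report(summary, forecast):
    """Step 4: report the result in a readable form."""
    return f"{summary['count']} periods, mean {summary['mean']:.1f}, forecast {forecast:.1f}"


sales = clean(raw_sales)
print(report(describe(sales), predict_next(sales)))
```

A real pipeline would replace each function with a proper tool (a database, a cleaning library, a forecasting model, a dashboard), but the order of the steps stays the same.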

In short, business and data analytics is the process of collecting historical data and analyzing it to gain insights such as trends, patterns, and their root causes. The analytics process then leads to data-driven business decisions based on the data that was analyzed.

Business Intelligence (BI) and data analytics have attracted major attention from businesses. A business that has invested in developing BI tools can gain a competitive advantage over its competitors. The tools and decision support systems do nothing by themselves, however: the company still has to make a decision based on their recommendations. The importance of data analytics depends on the industry and its nature, the internal and external environment, the company’s business model, and competitors’ actions.

As explained earlier, the importance of data analysis and BI varies and is affected by different characteristics. The more complex and information-intensive an industry is, the greater the strategic importance of BI and the greater the opportunity for competitive differentiation (Williams, 2016). The importance therefore varies from organization to organization, but in general, data analysis is used for production, targeted content, and efficient operation of the business. For instance, a business uses data analysis tools to decide which product to launch next, based on information about what its consumers want.

Furthermore, businesses use BI tools to design marketing campaigns and pricing strategies. Data collected from social media and other platforms about what customers want plays a vital role in creating marketing content for targeted customers. Finally, BI and data analysis are significant for the efficient operation of the business. For example, the company where I currently work did not have specific guidelines for how we order materials for production. After I joined, I collected data on how much material we use per month and on each vendor’s lead time, and built an automation that suggests buying before we run out. Data analysis is therefore not always tied to the customer; it can also help the business run efficient operations.
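An ordering suggestion of this kind can be sketched as a simple reorder-point rule. The code below is an assumed illustration, not the actual automation: the function name, the seven-day safety buffer, and the 30-day month are hypothetical choices.

```python
import math


def reorder_suggestion(on_hand, monthly_usage, lead_time_days, safety_days=7, days_per_month=30):
    """Suggest whether to order now, based on average usage and vendor lead time."""
    daily_usage = monthly_usage / days_per_month
    # Stock needed to cover the vendor lead time plus a safety buffer.
    reorder_point = math.ceil(daily_usage * (lead_time_days + safety_days))
    return {"reorder_point": reorder_point, "order_now": on_hand <= reorder_point}


# Hypothetical material: 900 units used per month, vendor delivers in 10 days.
suggestion = reorder_suggestion(on_hand=400, monthly_usage=900, lead_time_days=10)
print(suggestion)
```

The rule orders as soon as the stock on hand would no longer cover the waiting time for a delivery, which is the point at which buying "beforehand" stops being possible.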

Data is information collected from various sources. It can be integers, characters, or any other type that holds a value. Data by itself does not represent anything and has no value without processing. Processing means cleaning the data and sorting it into a usable format, which can later be interpreted, used for prediction, and visualized. Business organizations, as explained in the previous section, can use it to make strategic or operational decisions such as marketing, personalizing the product, and increasing the wealth of the organization. Hannila et al. (2019) explained that facts based on company data assets are essential for decision-making instead of “gut feeling” and emotions. The utilization of the unused potential of data assets is promoted in the transformation toward data-driven product portfolio management.

Raw data has potential that a company can unlock through analysis and then use to grow. “Analysis” refers to inspecting, cleaning, and modeling the data into a format from which a conclusion can be drawn. The company collects the raw data, processes it, and uses it to make decisions; data analysis therefore cannot be done without the data.

There are various sources of data. Choosing the right source depends on the purpose for which the data is being collected. The sources of data can be classified into two types: statistical and non-statistical. Statistical sources are those where data is collected for official purposes, such as censuses and other administrative surveys, while non-statistical sources are those where data is collected through surveys performed by the private sector. In terms of the collection method, we can collect data from internal and external sources. For example, data collected from our own historical database is considered internal, whereas data collected from outside the organization is considered external.

Internal data is the data that the company itself holds, for example, the historical use of raw materials, the profit of the last decade, and so on. Such data is stored in various mediums within an organization. In most cases, it is presented in annual and periodic reports.

On the other hand, external data is collected from outside the organization. Organizations can collect information from the market, their competitors, social media, and so on. For example, a company can collect data on the people of Niagara Falls, such as their income level, race and ethnicity, citizenship, and veteran status, from Data USA (2022), so that it can predict which product to launch in the target location or run a marketing campaign based on that data.

We see data wherever we go. Data can be structured or unstructured, and until we make it usable, it is raw data. The data we encounter daily includes, for instance, weather records, stock market figures, photo albums, music playlists, or even Instagram accounts. Cuesta & Kumar (2016) agree that data can be seen as the essential raw material for any kind of human activity. Further, they illustrate that data has one of two natures: categorical or numerical.

Picture 1: Nature of data

Source: Cuesta & Kumar (2016, p. 12)

Categorical data is information that can be sorted into categories or groups. It has two sub-types: nominal and ordinal. Nominal data has no inherent ordering among its categories. For instance, a vehicle’s ownership status can be either leased or fully paid. Ordinal data, however, has ordered categories. For example, temperature can be categorized as high, medium, or low.

On the other hand, numerical data consists of values that can be measured. Numerical data has two categories. The first is discrete data, which can be counted in separate units, for example, the number of houses in the city of Fremont. The second is continuous data, which can take any value within a range. An obvious example is the inflation rate over the last 100 years: the percentage can be any value within an interval.
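The nature of the data decides which operations make sense on it. The following minimal Python sketch uses made-up values for each of the four natures; the variable names and figures are illustrative assumptions only.

```python
from collections import Counter
from statistics import mean

ownership = ["lease", "fully paid", "lease", "lease"]  # nominal: categories without order
temperature = ["low", "high", "medium", "low"]         # ordinal: categories with order
houses_per_block = [12, 8, 15]                         # discrete: counted in whole units
inflation_rate = [2.1, 3.4, 1.8]                       # continuous: any value in a range

# Nominal data can be counted, but sorting it carries no meaning.
ownership_counts = Counter(ownership)

# Ordinal data needs its order stated explicitly; alphabetical order would be wrong.
TEMPERATURE_ORDER = {"low": 0, "medium": 1, "high": 2}
temperature_sorted = sorted(temperature, key=TEMPERATURE_ORDER.get)

# Numerical data supports arithmetic such as sums and averages.
total_houses = sum(houses_per_block)
average_inflation = mean(inflation_rate)

print(ownership_counts, temperature_sorted, total_houses, round(average_inflation, 2))
```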

Several metrics make data analytics-ready. In other words, for data to be ready for analytics, it should be reliable, accurate, accessible, secure, and rich. Sebastian-Coleman (2012) discusses completeness, timeliness, validity, consistency, and integrity as the common metrics that make data ready for analysis.

Picture 2: Metrics of data

Source: Sebastian-Coleman (2012)

Data completeness is the wholeness of data. For data to be complete, it should not have any gaps or missing information. Incomplete data should not be used for analysis; if it is, it distorts the outcome and creates confusion. For example, suppose a company is collecting data on potential home buyers and there is no information on a customer’s monthly income or credit score. The data cannot be analyzed properly, because monthly income is one of the main metrics that determine whether a customer qualifies to buy a house.
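A completeness check for the housing example could look like the sketch below. The required field names and the sample customers are hypothetical, chosen only to mirror the example above.

```python
# Fields assumed to be required before a customer record can be analyzed.
REQUIRED_FIELDS = ("name", "monthly_income", "credit_score")


def missing_fields(record):
    """List the required fields that are absent or empty in one record."""
    return [field for field in REQUIRED_FIELDS if record.get(field) in (None, "")]


def completeness(records):
    """Share of records in which every required field is filled in."""
    if not records:
        return 0.0
    complete = sum(1 for record in records if not missing_fields(record))
    return complete / len(records)


customers = [
    {"name": "A", "monthly_income": 5200, "credit_score": 710},
    {"name": "B", "monthly_income": None, "credit_score": 680},  # income missing
    {"name": "C", "monthly_income": 4100},                       # credit score missing
    {"name": "D", "monthly_income": 6100, "credit_score": 745},
]

print(completeness(customers))
print([missing_fields(customer) for customer in customers])
```

Reporting which fields are missing, and not only the overall share, tells the analyst what has to be collected before the data can be used.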

Similarly, timeliness, as its name suggests, is the quality of data being up to date. If the data is not updated in time, it could mislead the organization into making a wrong decision, costing it money and time and even damaging its reputation. Furthermore, the validity of the data is how well a measuring method measures what it is intended to measure. For example, if we are studying the relationship between household income and home ownership, the way we collect and store the data and define the variables should suit the regression we intend to fit. If the research has high validity, its results correspond to real properties and can be used to make a decision.
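Timeliness is the easiest of these metrics to check automatically. The sketch below flags records that have not been updated recently; the 90-day threshold, the field names, and the sample dates are assumptions made for the example.

```python
from datetime import date, timedelta


def is_timely(last_updated, as_of, max_age_days=90):
    """Return True when a record was updated within the allowed age."""
    return (as_of - last_updated) <= timedelta(days=max_age_days)


# Hypothetical records with the date each one was last updated.
records = [
    {"id": 1, "last_updated": date(2022, 5, 1)},
    {"id": 2, "last_updated": date(2021, 12, 1)},
    {"id": 3, "last_updated": date(2022, 4, 1)},
]

as_of = date(2022, 6, 14)
stale_ids = [record["id"] for record in records if not is_timely(record["last_updated"], as_of)]
print(stale_ids)
```

What counts as "up to date" depends on the decision being made: stock prices go stale in minutes, while census figures stay usable for years.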

Data consistency means keeping data consistent across different applications and networks. During the data analysis process, data moves across various programs, and a uniform format makes analysis easier and reduces the time, money, and extra effort spent on mining it. If the data is not consistent, it is hard to claim that it is uniform across the network.
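A common consistency problem is the same date written differently by different applications. As a minimal sketch, assuming only the three formats listed in the code, the dates can be brought into one uniform format like this:

```python
from datetime import datetime

# Date formats assumed to occur in the source systems (an illustrative list).
KNOWN_FORMATS = ("%Y-%m-%d", "%m/%d/%Y", "%d.%m.%Y")


def to_iso_date(text):
    """Convert a date string in any known format to the uniform YYYY-MM-DD form."""
    for fmt in KNOWN_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).date().isoformat()
        except ValueError:
            continue  # try the next known format
    raise ValueError(f"unrecognized date format: {text!r}")


mixed_dates = ["2022-06-14", "06/14/2022", "14.06.2022"]
uniform_dates = [to_iso_date(value) for value in mixed_dates]
print(uniform_dates)
```

Raising an error for an unknown format, instead of guessing, keeps an inconsistent value from slipping silently into the analysis.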

Finally, data integrity covers all of the above, including data accuracy, completeness, consistency, and security. When there is data integrity, the data stored in the database remains complete, consistent, and reliable even if it has been stored for many years and accessed many times. Data integrity therefore also helps protect the company from data leaks or losses caused by outside forces and malicious intent.

Data collected from various sources is raw and cannot be used without cleaning and mining. Raw data is usually dirty, misaligned, overly complex, and inaccurate (Chu et al., 2016). Data processing, cleaning, and mining are therefore needed to make this raw data ready for analytics.

Data cleaning is the process of making data usable. This means removing information that is not applicable or unusable, such as values that are missing or stored in the wrong format, so that a computer program can read the data and use it for analysis. Raw data might contain incorrect, corrupted, incorrectly formatted, duplicate, or incomplete entries, which are removed or corrected during processing. If a company uses raw data without processing it, the result is often misleading.
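The cleaning described above can be sketched in plain Python. The row layout, the field names, and the rule that a row without a name or a readable amount is dropped are assumptions chosen for illustration.

```python
def clean_rows(rows):
    """Normalize formatting, then drop corrupted, incomplete, and duplicate rows."""
    seen = set()
    cleaned = []
    for row in rows:
        # Fix formatting: trim spaces and use one capitalization style.
        name = (row.get("name") or "").strip().title()
        amount_text = str(row.get("amount", "")).replace(",", "").strip()
        try:
            amount = float(amount_text)
        except ValueError:
            continue  # corrupted or missing amount
        if not name:
            continue  # incomplete row
        key = (name, amount)
        if key in seen:
            continue  # duplicate row
        seen.add(key)
        cleaned.append({"name": name, "amount": amount})
    return cleaned


raw_rows = [
    {"name": " acme corp ", "amount": "1,200"},
    {"name": "Acme Corp", "amount": "1200"},   # duplicate once formatting is fixed
    {"name": "Globex", "amount": "abc"},       # corrupted amount
    {"name": "", "amount": "300"},             # missing name
    {"name": "Initech", "amount": "450.50"},
]

print(clean_rows(raw_rows))
```

Formatting is normalized before duplicates are checked, because two rows often only look different until their spacing and capitalization are fixed.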

One example of cleaning raw data in Excel is handling error values such as #DIV/0! and #N/A. When a formula divides by a cell that contains zero, Excel returns #DIV/0! because it cannot divide by zero; when a lookup cannot find a value, Excel returns #N/A. Either error propagates through every formula that depends on the cell, so the final result is also an error. An analyst can handle these values in several ways: by wrapping the formula in the IFERROR function (or IFNA, which catches only #N/A), by sorting the values and replacing the errors with zero, or by deleting the affected rows. The result is then no longer affected by the error values.
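The same idea, replacing an error result with a fallback value, can be sketched outside Excel. In the Python example below, the function name and the sample numbers are hypothetical.

```python
def safe_divide(numerator, denominator, default=0.0):
    """Return numerator / denominator, or `default` when the denominator is zero."""
    try:
        return numerator / denominator
    except ZeroDivisionError:
        return default


revenue = [100.0, 250.0, 80.0]
units_sold = [4, 0, 2]

# Replace the impossible division with zero.
price_per_unit = [safe_divide(r, u) for r, u in zip(revenue, units_sold)]

# Alternative: mark the value as missing instead of pretending it is zero.
price_or_missing = [safe_divide(r, u, default=None) for r, u in zip(revenue, units_sold)]

print(price_per_unit)
print(price_or_missing)
```

Replacing an error with zero keeps later calculations running, but it also pulls averages down; marking the value as missing and excluding it is often the safer choice.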

Data visualization is the process of summarizing data and making it visible. There are many ways of making data visual; charts, graphs, dashboards, and infographics are the common methods in use. The main objective of data visualization is to help the human brain pick out insights from the data and identify patterns.

Data visualization offers an instant way of communicating through visual information. It is significant to every business organization because it helps the firm identify the factors that affect customer behavior and the areas where the firm needs to improve. It also helps businesses decide when and where to launch a new product, and it can reveal trends and support predictions of future sales revenue. Besides, data visualization helps business organizations make decisions fast, learn what customers are interested in, distribute information easily within the organization, and so on. In short, it helps a business reach findings quickly and act on them, so that it can succeed faster and with fewer errors.

Practical examples of data visualization include tables, bar graphs, area charts, line graphs, pie charts, bullet graphs, infographics, and so on. For example, the United States Census Bureau (USCB, 2021) presented data from the 2020 census on a map of the USA showing the percentage change in county population over the 10 years from 2010 to 2020. It is a good example of data visualization, because by looking at the map people can see by what percentage the population of a particular county has changed. With the help of data visualization tools, people understand the data easily.
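Even a very simple chart makes such a comparison easier to read than a column of numbers. The sketch below computes the percentage change in population and draws a text bar chart with the standard library only. The county names and population figures are invented for the example and are not actual census data.

```python
def percent_change(old, new):
    """Percentage change from the old value to the new value."""
    return (new - old) / old * 100


def text_bar_chart(values, width=30):
    """Draw one bar per label, scaled so the largest change fills `width` characters."""
    largest = max(abs(value) for value in values.values()) or 1
    lines = []
    for label, value in values.items():
        bar = "#" * round(abs(value) / largest * width)
        lines.append(f"{label:<10} {bar} {value:+.1f}%")
    return "\n".join(lines)


# Hypothetical county populations as (2010, 2020) pairs.
populations = {
    "County A": (100_000, 112_000),
    "County B": (50_000, 47_500),
    "County C": (80_000, 84_000),
}

changes = {name: percent_change(old, new) for name, (old, new) in populations.items()}
chart = text_bar_chart(changes)
print(chart)
```

A plotting library or a BI dashboard would draw the same information as a colored map or graph; the calculation underneath stays the same.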

Chu, X., Ilyas, I. F., Krishnan, S., & Wang, J. (2016, June). Data cleaning: Overview and emerging challenges. In Proceedings of the 2016 international conference on management of data (pp. 2201–2206).

Cuesta, H., & Kumar, S. (2016). Practical data analysis. Packt Publishing Ltd.

Data USA (2022, May 14). Niagara Falls, NY. https://datausa.io/profile/geo/niagara-falls-ny/

Hannila, H., Silvola, R., Harkonen, J., & Haapasalo, H. (2019). Data-driven begins with DATA; potential of data assets. Journal of Computer Information Systems, (1), pp. 29–38. https://doi.org/10.1080/08874417.2019.1683782

Oracle (2022, May 14). What is Business Analytics? https://www.oracle.com/business-analytics/what-is-business-analytics/

Sebastian-Coleman, L. (2012). Measuring data quality for ongoing improvement: a data quality assessment framework. Newnes.

United States Census Bureau (2021). 2020 Census: Percent Change in County Population: 2010 to 2020. US Department of Commerce. https://www.census.gov/library/visualizations/2021/dec/percent-change-county-population.html

Wake Forest University (2022, May 14). What is Business Analytic? https://business.wfu.edu/masters-in-business-analytics/articles/what-is-analytics/

Williams, S. (2016). Business intelligence strategy and big data analytics: a general management perspective. Morgan Kaufmann. https://www.doi.org/10.1016/b978-0-12-809198-2.00003-8

Originally published at https://www.durgaacharya.com.
