In our Data as a Product blog series, the earlier blog on Data Architecture Principles for Management Blueprint covered the fundamentals of data architecture – a discipline that involves documenting an organization’s data assets, mapping how data flows through its systems, and providing a blueprint for managing data. Data architecture creates a multilayer framework for data platforms and management tools, as well as specifications and standards for collecting, integrating, transforming, and storing data.
Data architecture is crucial in supporting operational applications, defining the underlying data environment for business intelligence (BI) and advanced analytics initiatives, and creating effective data governance and internal data standards. A well-designed data architecture also aids in developing effective data analytics platforms that deliver useful information and insights, and it improves strategic planning and operational decision-making, among other benefits.
Now let’s understand how data architecture goes hand in hand with data modeling, which creates diagrams of data structures, business rules and relationships between data elements. We’ll also look at the frameworks and roles associated with data architecture.
Data architecture vs. Data modeling
Data modeling focuses on the details of specific data assets. It creates a visual representation of data entities, their attributes, and how different entities relate to each other. This helps in scoping the data requirements for applications and systems and then designing database structures for the data, a process that’s done through a progression of conceptual, logical and physical data models.
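As a rough illustration of the entities, attributes, and relationships described above, here is a minimal Python sketch of a small logical model and one possible way to derive a physical table definition from it. The `Attribute`, `Entity`, and `Relationship` classes, the `customer` and `customer_order` entities, the `SQL_TYPES` mapping, and the `to_ddl` helper are all hypothetical names chosen for this example; they are not taken from any modeling tool.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class Attribute:
    name: str
    data_type: str        # logical type such as "integer", "text" or "date"
    is_key: bool = False  # True for the entity's identifying attribute


@dataclass
class Entity:
    name: str
    attributes: list[Attribute] = field(default_factory=list)


@dataclass(frozen=True)
class Relationship:
    parent: str       # entity on the "one" side
    child: str        # entity on the "many" side
    foreign_key: str  # attribute on the child that references the parent


# Assumed mapping from logical types to SQL column types.
SQL_TYPES = {"integer": "INTEGER", "text": "TEXT", "date": "DATE"}


def to_ddl(entity: Entity, entities: dict[str, Entity],
           relationships: list[Relationship]) -> str:
    """Render one logical entity as a CREATE TABLE statement."""
    lines = []
    for attr in entity.attributes:
        column = f"  {attr.name} {SQL_TYPES[attr.data_type]}"
        if attr.is_key:
            column += " PRIMARY KEY"
        lines.append(column)
    # Add a foreign key clause for every relationship where this entity is the child.
    for rel in relationships:
        if rel.child == entity.name:
            parent_key = next(a.name for a in entities[rel.parent].attributes if a.is_key)
            lines.append(
                f"  FOREIGN KEY ({rel.foreign_key}) REFERENCES {rel.parent} ({parent_key})"
            )
    return f"CREATE TABLE {entity.name} (\n" + ",\n".join(lines) + "\n);"


# Logical model: two entities and a one-to-many relationship between them.
customer = Entity("customer", [
    Attribute("customer_id", "integer", is_key=True),
    Attribute("name", "text"),
])
customer_order = Entity("customer_order", [
    Attribute("order_id", "integer", is_key=True),
    Attribute("customer_id", "integer"),
    Attribute("order_date", "date"),
])
entities = {e.name: e for e in (customer, customer_order)}
relationships = [Relationship("customer", "customer_order", "customer_id")]

# Physical model: print the generated table definitions.
for e in entities.values():
    print(to_ddl(e, entities, relationships))
```

Keeping the logical model as plain data and generating the physical structure from it is only one possible approach, but it shows how the later modeling stages can build on the earlier ones.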
Data architecture takes a more global view of an organization’s data to create a framework for data management and usage. Data models are a crucial element in data architectures, and an established data architecture simplifies data modeling. Below are a few recommendations for data modeling:
- Gather both business and data requirements upfront, before building models
- Develop data models iteratively and incrementally to make the process manageable
- Use data models as a tool for communicating with business users about their needs
- Manage data models just like any other type of application code
Data architecture vs. Information and Enterprise architecture
The difference between data architecture and information architecture in enterprise applications comes down to the idea that ‘information is data in context’. An information architecture defines the context that an enterprise uses for its business operations and management. A data architecture that delivers high-quality, reliable data is the foundation for the information architecture.
Meanwhile, data architecture is commonly viewed as a subset of enterprise architecture (EA), which aims to create a blueprint for an organization across four or more domains. EA also encompasses the following:
What data architecture frameworks are available?
Organizations can use standardized frameworks to design and implement data architectures instead of starting completely from scratch. These are three well-known framework options:
DAMA-DMBOK2 – The DAMA Guide to the Data Management Body of Knowledge is a data management framework and reference guide created by DAMA International, a professional association for data managers. Now in its second edition and commonly known as DAMA-DMBOK2, the framework addresses data architecture along with other data management disciplines. The first edition was published in 2009, and the second one became available in 2017.
TOGAF – Created in 1995 and updated several times since then, TOGAF is an enterprise architecture framework and methodology that includes a section on data architecture design and roadmap development. It was developed by The Open Group, and the name initially stood for The Open Group Architecture Framework, but it’s now referred to simply as the TOGAF standard.
The Zachman Framework – This is an ontology framework that uses a 6-x-6 matrix of rows and columns to describe an enterprise architecture, including data elements. It doesn’t include an implementation methodology; instead, it’s meant to serve as the basis for an architecture. The framework was originally developed in 1987 by John Zachman, an IBM executive who retired from the company in 1990 and founded a consulting firm called Zachman International.
Key steps for creating a data architecture
Data management teams must work closely with business executives and other end users to develop a data architecture. If they don’t, it may not be in tune with business strategies and data requirements. Engaging with senior executives to get their support and meeting with users to understand their data needs are two of the nine data architecture planning steps.
Developing a full-scale enterprise data architecture starts with several important steps that data architects must follow when devising a solid data architecture plan:
1. Socialize with senior leaders – As with any strategic technology initiative, the value of developing a data architecture must be effectively articulated and continually communicated to C-suite executives. Craft a message that demonstrates the benefits a data architecture brings to the enterprise. Identify and engage key stakeholders to gain their support.
2. Identify the data personas – An organization’s technology environment is driven by the information needs of data consumers. Application system custodians are accountable for the data sets their applications produce and use. Ascertain the people who create, store, update, read and otherwise touch data within the enterprise. Identify stereotypical personas and characterize them according to their data touch points.
3. Determine information requirements – Engage the data consumers to understand their business strategy and solicit their business requirements for data. Document how those requirements relate to the abstract data domains, such as ‘customer’ or ‘product’ data, and the discrete data sets these consumers currently use or anticipate needing.
4. Evaluate information risks – Identify and interpret data governance directives and how they relate to the handling, management and protection of data.
5. Assess the data landscape – Survey and document the name, location, owner, producer, consumers and contents of enterprise data sets. Classify each data set according to usage scenarios and sensitivity and collect this information in a data catalog.
6. Analyze the data lifecycles – Evaluate how data sets flow from their origination points to their final destinations. Document the data lineage mapping of data pipelines.
7. Appraise the data infrastructure – Document the current state of data management in the enterprise and capture the current technology infrastructure – what systems, database structures, data warehouses, data marts and operational data stores are used, whether they’re on premises or in the cloud and, if the latter, the cloud service providers.
8. Do a SWOT analysis – Synthesize the knowledge that has been collected and analyze the strengths, weaknesses, opportunities and threats. Identify the greatest opportunities for improvement.
9. Create a blueprint and roadmap – Devise a blueprint for framing the enterprise data architecture that summarizes the collected knowledge and highlights proposed deployment projects. Scope out a roadmap for the proposed projects across the near-term, medium-term and longer-term horizons.
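Steps 5 and 6 above describe a data catalog and data lineage mapping. As a minimal sketch of what those artifacts might look like in code, the example below records a few catalog entries and walks a lineage graph to find every upstream source of a data set. The `CatalogEntry` fields mirror the attributes listed in step 5; the data set names, locations, owners, sensitivity labels, and the `upstream_sources` and `restricted_inputs` helpers are all invented for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class CatalogEntry:
    name: str
    location: str
    owner: str
    producer: str
    consumers: list[str]
    contents: str
    usage: str        # usage scenario, e.g. "operational" or "analytics"
    sensitivity: str  # assumed labels: "public", "internal", "restricted"


# A tiny, made-up catalog keyed by data set name.
catalog = {
    entry.name: entry
    for entry in [
        CatalogEntry("orders_raw", "warehouse/raw/orders", "sales-ops", "order-service",
                     ["orders_clean"], "order events", "operational", "internal"),
        CatalogEntry("customers_raw", "warehouse/raw/customers", "crm-team", "crm",
                     ["orders_clean"], "customer records", "operational", "restricted"),
        CatalogEntry("orders_clean", "warehouse/clean/orders", "data-eng", "nightly-job",
                     ["sales_mart"], "validated orders", "analytics", "internal"),
        CatalogEntry("sales_mart", "warehouse/marts/sales", "bi-team", "nightly-job",
                     ["dashboards"], "aggregated sales", "analytics", "internal"),
    ]
}

# Lineage: each data set maps to the data sets it is built from directly.
lineage = {
    "sales_mart": ["orders_clean"],
    "orders_clean": ["orders_raw", "customers_raw"],
}


def upstream_sources(dataset: str, lineage: dict[str, list[str]]) -> set[str]:
    """Return every data set that feeds `dataset`, directly or indirectly."""
    seen: set[str] = set()
    stack = list(lineage.get(dataset, []))
    while stack:
        current = stack.pop()
        if current in seen:
            continue  # already visited, also guards against cycles
        seen.add(current)
        stack.extend(lineage.get(current, []))
    return seen


def restricted_inputs(dataset: str, catalog: dict[str, CatalogEntry],
                      lineage: dict[str, list[str]]) -> list[str]:
    """List upstream data sets that the catalog classifies as restricted."""
    return sorted(
        name for name in upstream_sources(dataset, lineage)
        if name in catalog and catalog[name].sensitivity == "restricted"
    )


print(sorted(upstream_sources("sales_mart", lineage)))
print(restricted_inputs("sales_mart", catalog, lineage))
```

Combining the two structures this way can also feed step 4: once sensitivity is recorded in the catalog, the lineage graph shows which downstream data sets inherit sensitive inputs.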
In addition to the 9 steps mentioned above, we recommend organizations do the following:
What are the different roles in data architecture design and development?
The lead role in data architecture initiatives typically goes to data architects. They need a variety of technical skills, as well as the ability to interact and communicate with business users. A data architect spends a lot of time working with end users to document business processes and existing data usage, as well as new data requirements.
On the technical side, data architects create data models themselves and supervise modeling work by others. They also build data architecture blueprints, data flow diagrams and other artifacts. Other duties may involve outlining data integration processes and overseeing the development of data definitions, business glossaries and data catalogs. In some organizations, data architects are also responsible for designing data platforms and evaluating and selecting technologies.
Other data management professionals who often are involved in the data architecture process include the following: