A Dimensional Modeling Primer with Mark Peco
Mark Peco, long-time TDWI faculty member and industry consultant, discusses the basics of dimensional modeling -- including a preview of his courses at TDWI Orlando.
- By Upside Staff
- September 22, 2023
In this recent episode of “Speaking of Data,” Mark Peco spoke about the ins and outs of dimensional modeling. Peco is a veteran consultant and TDWI faculty member. [Editor’s note: Speaker quotations have been edited for length and clarity.]
“At its most basic,” Peco began, “a data model is simply a way to structure and organize different kinds of data for different purposes.” Two of the most common types of data models are relational and dimensional (other types of models exist, such as entity-relationship models, but are less commonly used).
“Relational models are a generic way to organize data to serve a variety of purposes,” he said. “They are better suited to general-purpose solutions involving structured data -- for example, managing transactions or operational activities.”
“In contrast,” Peco went on, “dimensional models are oriented specifically around problems of measurement, analysis, and process improvement.” Although you could do all the things we do in dimensional models with relational models, he added, dimensional models are optimized to require much less effort to get at the necessary data. They make analysts more productive because they’re organized around the specific problems analysts are trying to solve, he said.
“Dimensional models are made up of two elements: facts and dimensions,” he explained. “A fact quantifies a property (e.g., a process cost or efficiency score) and is a measurement that can be captured at a point in time. It’s essentially just a number. A dimension provides the context for that number (e.g., when it was measured, who was the customer, what was the product).” It’s through combining facts and dimensions that we create information that can be used to answer business questions, especially those that relate to process improvement or business performance, Peco said.
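To make the fact-and-dimension split concrete, here is a minimal sketch in Python using the standard-library `sqlite3` module. The table names (`fact_sales`, `dim_date`, `dim_product`), the columns, and the sample rows are all hypothetical and are not taken from Peco's talk; they only illustrate how a numeric fact gets its context from the dimensions it is joined to.

```python
import sqlite3

# Hypothetical star schema: one fact table surrounded by two dimensions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_date (
    date_key INTEGER PRIMARY KEY, calendar_date TEXT, month TEXT);
CREATE TABLE dim_product (
    product_key INTEGER PRIMARY KEY, product_name TEXT, category TEXT);
CREATE TABLE fact_sales (
    date_key INTEGER REFERENCES dim_date(date_key),
    product_key INTEGER REFERENCES dim_product(product_key),
    sales_amount REAL);  -- the fact: essentially just a number
""")

# Illustrative sample data only.
conn.executemany("INSERT INTO dim_date VALUES (?, ?, ?)",
                 [(1, "2023-09-01", "2023-09"), (2, "2023-10-01", "2023-10")])
conn.executemany("INSERT INTO dim_product VALUES (?, ?, ?)",
                 [(10, "Widget", "Hardware"), (11, "Manual", "Books")])
conn.executemany("INSERT INTO fact_sales VALUES (?, ?, ?)",
                 [(1, 10, 100.0), (1, 11, 20.0), (2, 10, 150.0)])

# Combine the fact with its dimensions to answer a business question:
# how much was sold per month and product category?
rows = conn.execute("""
    SELECT d.month, p.category, SUM(f.sales_amount)
    FROM fact_sales AS f
    JOIN dim_date AS d ON d.date_key = f.date_key
    JOIN dim_product AS p ON p.product_key = f.product_key
    GROUP BY d.month, p.category
    ORDER BY d.month, p.category
""").fetchall()

for month, category, total in rows:
    print(month, category, total)
```

The fact table holds only keys and a measurement; everything an analyst would filter or group by lives in the dimension tables.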
Peco went on to say that one of the biggest challenges he sees among companies using dimensional models is integrating the potentially huge number of models into one coherent picture of the business.
“A company has many, many processes,” he said, “and each requires its own dimensional model, so there has to be some way of joining these models together to give a complete picture of the organization.” Peco explained that Ralph Kimball, the original developer of dimensional models back in the 1990s, spoke about what he called “conformed dimensions” -- dimensions that are used across several models, such as a calendar, a product hierarchy, or a list of sales territories. By aligning different sets of facts with these same dimensions, Peco said, companies can get a more complete picture of how things are going.
However, he added, it takes discipline for project teams to follow existing standards to use these conformed dimensions as they build new models. Otherwise, he said, they will wind up building what are called “independent models,” which can’t be used for analysis across the organization, defeating the purpose of building the model in the first place.
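The idea of a conformed dimension can be sketched in a few lines of plain Python. In this hypothetical example, two fact sets from different business processes (`fact_sales` and `fact_returns`) both reference the same calendar dimension (`dim_date`), so their totals can be lined up by month. All names and figures are assumptions made for illustration, not something described in the interview.

```python
from collections import defaultdict

# One conformed calendar dimension shared by both models (hypothetical data).
dim_date = {
    1: {"month": "2023-09"},
    2: {"month": "2023-09"},
    3: {"month": "2023-10"},
}

# Facts from two separate processes, both keyed to the same dim_date.
fact_sales = [
    {"date_key": 1, "amount": 100.0},
    {"date_key": 2, "amount": 50.0},
    {"date_key": 3, "amount": 200.0},
]
fact_returns = [
    {"date_key": 2, "amount": 10.0},
    {"date_key": 3, "amount": 40.0},
]


def total_by_month(facts, dim):
    """Sum fact amounts per month, looking the month up in the dimension."""
    totals = defaultdict(float)
    for row in facts:
        totals[dim[row["date_key"]]["month"]] += row["amount"]
    return dict(totals)


sales = total_by_month(fact_sales, dim_date)
returns = total_by_month(fact_returns, dim_date)

# Both results use the same month labels, so they can be compared directly.
return_rate = {month: returns.get(month, 0.0) / sales[month] for month in sales}
print(return_rate)
```

If each team had defined its own calendar (say, one by fiscal period and one by calendar month), the final comparison would need a mapping step or would not be possible at all, which is the problem with independent models described above.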
When dimensional models first arose, responsibility for maintaining this discipline would have fallen to the IT department, Peco said. However, with recent moves toward business ownership of data and analytics, he believes it will fall more on C-level executives (CDOs, CAOs, etc.), especially as the importance of data governance to modeling initiatives increases.
“People think that data modeling is strictly a technical design process,” he said, “but that’s only part of the puzzle. It’s just as important to clearly define the business question the model is meant to answer, the purpose and scope of the model, the data requirements, and so on.” In the course he will be teaching at TDWI Orlando this fall, Peco will take students through this process step by step.
“We’ll go through the life cycle of a dimensional model on paper, from defining the problem to different levels of design,” he said. “Then we’ll have a critique session and show everyone how things fit together in terms of the overall life cycle of the model.”
[Editor's note: You can listen to the entire conversation on demand here.]