What’s the RDB? Part 1

If you are planning to write reports over your Project Server data, you will need to know about the Reporting Database or RDB for short.  

A Little History

In Project Server 2003, all of the data was typically in one database.  Since this database was optimized to support Project Server transactions, reporting on that data could be a bit challenging.  Also, there were potential performance concerns as a long running report could impact the performance of core operations.

In Project Server 2007 and continuing in Project Server 2010, a separate Reporting Database is maintained.  This separation provides a database schema that is friendlier to report writers and it allows administrators to move the RDB to a separate database server if performance becomes a concern.

Project Server 2013 recombined the four databases into one physical database using four different schemas. This change makes Project Server 2013 easier to manage in a cloud infrastructure. If you upgrade from 2010 to 2013 and use the name of the reporting database as your database name, all of your reports will still work with no change.

In part 1, an overview will be presented and an overview of one key relationship within the RDB will be presented.

Ok, So What’s In It?

The Reporting Database is used as a staging area for OLAP Database creation and to provide project server data in an easier format to query.  It contains data on the following major entities:

  • Project Decision
  • Project
  • Task
  • Task by Day
  • Assignment
  • Assignment by Day
  • Resource
  • Resource by Day
  • Timesheet and Administrative Time
  • Issues
  • Risks
  • Deliverables
  • Workflow
  • Other Supporting entities

It’s a bonanza of data for you to use.  Here’s a simplified relationship diagram of the Reporting database core tables.  I’ll use this as the basis for posts going forward and show you how to get some nice reports out of this data.

simplified ERD.

For your convenience, I’m hosting an online interactive copy of the RDB Field Picker.that comes in the Project Server 2010 SDK.

The P-T-A-R Principle

One of the key reporting relationships in the Reporting Database is that between Project – Task – Assignment – Resource (PTAR).  This relationship constitutes the majority of Project Management centric reporting. 

Project

The Project entity contains the core information related to the Project plan.  There is one record for each project on Project Server.  This record includes:

  • Any Project level custom fields
  • Project rollup of work and costs
  • Project start and end dates
  • Project baselines

A complete Excel based field reference can be found in the Project Software Development Kit, found here: Download the Project 2010 SDK

The primary view for Project information in the RDB is MSP_EPMProject_UserView where ProjectUID is the primary identifier.  In the diagram above, all of the other tables that are linked to Project are related to that table by the ProjectUID value.  If you are writing queries which require Project information, your join should use the ProjectUID.

Task

The Task entity contains task level information for each task in every project.  The data is related to the task itself and not to a specific resource.  This record includes:

  • Start and End Dates
  • Baselines
  • Task Custom Fields
  • Work
  • Costs
  • Characteristics of the Task

Task characteristics are especially of interest when reporting over the data.  These tend to follow the naming convention of TaskIs… For example, if I wanted a Milestone only report for a given project, I would select all task records where projectuid is equal to the Project’s UID and TaskIsMilestone is equal to 1.  There are a number of these fields and I would suggest looking at the RDB Field Reference for more details.

The primary task view is MSP_EpmTask_UserView where TaskUID is the primary key.  Though TaskUID is a Globally Unique Identifier (GUID), when working with other related tables/views, I still do my joins using both ProjectUID and TaskUID.  The chances of duplicate GUID values are low but I don’t want to be that type of lucky with my project reporting.

Assignment

The Assignment entity contains information specific for a given resource on a given task.  Therefore, there will be a record for each resource for each task.  The primary view is MSP_EpmAssignment_UserView where AssignmentUID is the primary identifier. 

The Assignment record contains primarily:

  • Dates
  • Costs
  • Work
  • Baselines
  • Custom Field values where rolldown was enabled.

As a consequence, the Assignments table is one of the largest in the RDB and in Project Server as a whole.  As you need this data to derive the relationship between resources and the tasks to which they are assigned, play specific attention to query execution plans when creating new reports.

Resource

The Resource entity contains information related to a specific resource, be it a work, material, cost or budget resource.   There is one record for each resource created in Project Server.  The resource record contains:

  • Resource Custom Fields
  • Resource Characteristics
  • Login Account information
  • Organization information like Team, RBS
  • Cost rates
  • Timesheet Manager

Like the Tasks, the Resource view follows a similar naming convention of ResourceIs… to denote Resource characteristics.  For example, if I only want to see active Resources, I would filter my records based on ResourceIsActive = 1.  Especially for resources, you should add filtering on these fields to ensure you do not include team, generic and inactive resources in your reporting.

That covers the core aspects of these entities.  In future posts, I’ll go into greater details as to how best to query against them as well as cover other relationships within the RDB.

I need information, now what do I do?

If you’ve ever been asked for information to support work you are doing or perhaps, to keep people informed of your efforts, you may have struggled with the process to define and deliver the information needed.  If you identify with this issue, this post will cover three key questions you should answer to help set your requirements so that you can get to your information goal quickly.

What’s the Problem and Why Do I Care?

Our starting point is based on a quote from one of my favorite professors.  “What is the problem and why do I care?”  His point was that you have to be clear on the problem and what you are trying to address if you are going to be successful in formulating a solution.  Clearly understanding the problem will enable you to be effective in gathering the right information for your answer.  You should be able to state it clearly and easily. 

Some examples are:

  • What is the overall cost and schedule status of my project?
  • Are my people overbooked in the fourth quarter?
  • What factors are impacting my project’s delivery?

Otherwise, you are going to waste time pulling together data while looking for a question to answer. 

What are you doing with the information?

The second question relates to how the information will be used.  Essentially, the purpose will help drive the how to best structure the outcome.  Many people fail to identify this aspect correctly, resulting in information that is not structured properly to meet the need.  I’ll cover each of these outcome types in detail in subsequent posts.

Most likely, you are doing one or more of the following to either draw a conclusion or to illustrate a point.  While these are broken out as distinct entities for illustration purposes, in many cases you will be using a combination of techniques.

In this example, we are attempting to understand why sales are so dismal in the North region for November.

Aggregation

Aggregation

When we look at sales data by region or we break down the number of hours entered against a project by each person, we are doing Aggregation.  We may also create synthetic groups to aggregate data based on some attribute of the underlying data.  In many cases, simple number charts are used to convey the data. 

In the example above, we are looking at the November Sales numbers for the Blue, Green and Red Sales teams by region.  The Red team seemed to do well in November, at least according to this view.

Comparison

stackrank

Typically, if you have aggregated data, it is likely that you need to compare and rank the groups of data.  The Stack Rank is a very common scenario where you are ranking the data by Best to worst, based on some criteria.  Number charts, bar charts and to a lesser extent, pie charts are commonly used for comparisons. 

In this example, the West region actually had the most November sales of any region.  The Red Team, leader on the previous view, actually sold the least amount in the West Region. 

Here’s why the problem statement is important.  Without having a clear definition of the problem, it isn’t apparent which answer is correct, as different conclusions can be drawn from the same data.

Composition

composition

Another fairly common usage of information is to illustrate the composition of the data.  In this scenario, we may be attempting to determine which region has the most salespeople.  When used properly, a composition can be used to quickly convey relevant data.  We see that the East and North Regions are staffed with a smaller number of salespeople than the other regions.  This may give us a clue as to why sales are lagging in the North.

Trends

Timeseries

Another interesting information analysis you may choose to do is to understand how information changes over time.  This type of visualization allows you to understand the direction of progress, beyond the current state, enabling you to determine which items may be more worthy of your attention.  For example, projects that are late but are trending back to being on plan may be better off than a late project which is trending later. 

In this example, we see that sales in the North region are flat and actually beginning to decline.  In a real investigation, we would likely dig into this trend further since all other regions are growing.

Variation / Distribution

Distribution

Another way to visualize the data is to visualize how the data varies for a given period.  In many cases, the temptation is to only look at aggregated values or averages, but sometimes it’s the distribution of the data which tells a more compelling story.  Readers who have a statistical background will be very comfortable with this type of information as distributions, variances and other such items are core to statistical investigation.

In this case, we see the majority of deals for November are small deals, with a second peak.  This view would also provide a wider view of what’s happening.  Do we have a training issue?  Have vendors decided to cut back on orders due to the economy?  Are there other factors at play?  Without this view, these questions may not have been asked.

Relationship

Relationship

The last information type is to map out relationships.  If you are deriving information from people relationships, you might here the term “social graph”, which is one way to construct, visualize and consume relationship data..  Relationship maps may uncover potential dependencies between items like people, which are not normally covered by work management and financial management tools. 

In our example, one item jumps out of the data, in that the North region is covered out of one office.  As the other growing regions are covered out of multiple regions, there may be collaborations on how to approach a customer that aren’t happening in the Chicago office.  These collaborations may be resulting in more, small sales in the other offices.  Further research is warranted but you should consider relationship mapping as part of your information arsenal.

What behavior do you want to occur, as a result?

One last aspect to consider is what story should the Information you gather tell?  One way to determine the form of the story is to decide what behavior you want to occur as a result of the information you’ve gathered.  Targeting specific behavior helps you decide in which fashion the data will be presented.

For example, if the focal point of the your information is to ensure certain upcoming tasks are completed on time, you may determine that a stack rank of incomplete tasks, ranked by days until Due Date, is the best way to present the data. 

In the example used in this post, your intent may be to get another salesperson hired in the Atlanta office for the Blue team to support the North Region.  You are able to illustrate a sales decline, some potential reasons for it and you would have to present where investment could improve the situation.

Three aspects that can be used to determine your information requirements were covered in this post.  We’ll dive deeper into these and other relevant information in future posts.

TIP: Set the Default View In Project Pro

If you have a preferred view that you use in Project Professional that is not the Gantt view, it gets a bit tedious to change the view every single time you open the application. The easiest thing to do is to change the default view setting so that your preferred view is displayed when the application is opened.

To update the default view used:

  1. In Project Professional, click File, Options, General
  2. Under the Project View section, click the Default View dropdown and change the value to the desired view (highlighted below)
  3. Click OK to apply the change

image