Two in a box (if you can) and everyone in documentation (always).
In an IT context, “two in a box” refers to two servers or components that are designed to work together to provide redundancy and increased reliability. This setup ensures that if one component fails, the other takes over its operations, maintaining continuity of service. The goal of having “two in a box” is to provide high availability and disaster recovery. The same principle applies to human roles in an organization, but it is rarely implemented there.
Let’s look at a relevant Analytics example. We all likely know a person in our company or organization by name who is the “go-to” person for Analytics. They’re the ones who have reports or dashboards named after them: Mike’s Report or Jane’s Dashboard. Sure, there are other people who know analytics, but these are the true champions who seem to know how to get the hardest things done and overachieve on deadlines. The issue is that these people stand alone. In many cases, under pressure, they don’t work with anyone because that might slow them down, and this is where the problem begins. We never think that we are going to lose this person. I’ll refrain from the typical “let’s say they get hit by a bus,” or from an example built on current job market opportunities, and say something positive instead, like “they won the lottery!”, because we should all do our part to be positive these days.
The Story
Monday morning comes, and our analytics expert and champion MJ has submitted their resignation. MJ won the lottery and has already left the country without a care in the world. The team and the people who know MJ are thrilled and jealous, yet work must go on. Now the value and reality of what MJ was doing are about to be understood. MJ was responsible for the final publish and validation of the analytics. They always seemed to be able to improve efficiency or make that difficult change before supplying the analytics to everyone. No one really cared how it got done; everyone was secure in the fact that it just happened, and MJ was an individual Analytics rock star, so a level of autonomy was bestowed. Now, as the team starts to pick up the pieces (the requests, the daily issues, the modification requests), they are at a loss and begin to scramble. Reports and dashboards are found in unknown states. Some assets didn’t update over the weekend, and we don’t know why. People are asking what’s going on and when things will be fixed. Edits that MJ said were done aren’t showing up, and we have no idea why. The team looks bad. It’s a disaster, and now we all hate MJ.
The Lessons
There are some easy and obvious take-aways.
- Never allow an individual to work alone. Sounds good, but in smaller agile teams we don’t have the time or the people to make this happen. People come and go and tasks are many, so it is divide and conquer in the name of productivity.
- Everyone must share their knowledge. Also sounds good, but are we sharing with the right person or people? Keep in mind that many lottery winners are coworkers. Knowledge-sharing sessions also take time away from tasks, and most people only invest in skills and knowledge just in time, when they are needed.
So, what are some real solutions that everyone can implement and get behind?
Let’s start with Configuration Management. We’ll use this as the umbrella term for several similar topics.
- Change Management: The process of planning, implementing, and controlling changes to software systems in a structured and systematic way. This process aims to ensure that changes are made in a controlled and efficient manner (with the ability to revert), with minimum disruption to the existing system and maximum benefit to the organization.
- Project Management: The planning, organization, and control of software development projects to ensure that they are completed on time, within budget, and to the desired quality standards. It involves the coordination of resources, activities, and tasks throughout the software development lifecycle to achieve the project objectives and deliver the software product on schedule.
- Continuous Integration and Continuous Delivery (CI/CD): The process of automating the building, testing, and deployment of software. Continuous Integration means regularly merging code changes into a shared repository and running automated tests to detect errors early in the development process. Continuous Delivery keeps tested and validated code changes ready to release at any time, and Continuous Deployment goes a step further by releasing them into production automatically, allowing for rapid and frequent releases of new features and improvements.
- Version Control: The process of managing changes to source code and other software artifacts over time using specialized software tools. It allows developers to collaborate on a codebase, maintain a complete history of changes, and experiment with new features without affecting the main codebase.
All of the above are good software development practices. Analytics that drive and run the business deserve no less, as they are mission critical to decision making. All analytics assets (ETL jobs, semantic definitions, metrics definitions, reports, dashboards, stories, etc.) are just code snippets with a visual interface for designing them, and seemingly minor changes can wreak havoc on operations.
Configuration Management keeps us running in a good state. Assets are versioned so we can see what has happened over their life span, we know who is working on what along with the progress made and the timelines, and we know that production will go on. What no pure process covers is the transfer of knowledge and the understanding of why things are the way they are.
Every system, database, and analytics tool has its own quirks: things that make it go fast or slow, items that make it behave a certain way or produce a desired result. These can be settings at a system or global level, or things within the asset design that make it run just as it should. The problem is that most of these things are learned over time and there is not always a place to document them. Even as we move to Cloud systems, where we no longer control how the application executes and rely on the supplier to make it as fast as possible, the tweaking of definitions continues within our assets to unlock exactly what we are looking for. This knowledge needs to be captured and made available to others. It has to be a required part of each asset’s documentation and an integral part of the version control and CI/CD check-in and approval process, and in some cases part of a pre-publish checklist of things to do and not do.
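To make the idea of documentation as a check-in requirement concrete, here is a minimal Python sketch of a gate that a CI/CD pipeline could run before a change is approved. Everything in it is an assumption made for illustration: the section names, the `AssetDoc` structure, and the asset paths are hypothetical, and a real pipeline would read this documentation from files in the repository rather than from in-memory objects.

```python
from __future__ import annotations

from dataclasses import dataclass, field

# Hypothetical section names; require whatever your team agrees on.
# "Backup owner" is the human "two in a box" for the asset.
REQUIRED_SECTIONS = (
    "Purpose",
    "Owner",
    "Backup owner",
    "Known quirks",
    "Publish checklist",
)


@dataclass
class AssetDoc:
    """Documentation checked in alongside one analytics asset."""

    asset: str  # illustrative path, e.g. "reports/weekly_sales"
    sections: dict[str, str] = field(default_factory=dict)


def missing_sections(doc: AssetDoc) -> list[str]:
    """Return the required sections that are absent or left blank."""
    return [
        name
        for name in REQUIRED_SECTIONS
        if not doc.sections.get(name, "").strip()
    ]


def review_change(docs: list[AssetDoc]) -> dict[str, list[str]]:
    """Map each under-documented asset to the sections it still needs.

    An empty result means the change can move on to approval.
    """
    problems: dict[str, list[str]] = {}
    for doc in docs:
        gaps = missing_sections(doc)
        if gaps:
            problems[doc.asset] = gaps
    return problems


# Example change set: one fully documented report, one dashboard that is not.
docs = [
    AssetDoc(
        "reports/weekly_sales",
        {
            "Purpose": "Weekly sales summary for regional managers",
            "Owner": "MJ",
            "Backup owner": "Jane",
            "Known quirks": "Refresh must finish before the ETL lock at 02:00",
            "Publish checklist": "Validate totals against the finance extract",
        },
    ),
    AssetDoc("dashboards/churn", {"Purpose": "Track churn", "Owner": "MJ"}),
]

problems = review_change(docs)
for asset, gaps in problems.items():
    print(f"{asset}: missing {', '.join(gaps)}")
```

In this sketch the churn dashboard would be held back until someone names a backup owner and writes down the quirks and the publish checklist. The check cannot judge whether the notes are any good, but it does guarantee there is a place for them and that the place is not empty.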
There are no magic answers, and no AI, to cover for shortcuts in our analytics processes or the lack thereof. Regardless of the size of the team that keeps the data and analytics flowing, an investment in a system to track changes, version all assets, document the development process, and capture knowledge is a must. Investing in processes and time up front will save a ton of time later that would otherwise be wasted figuring things out just to keep our analytics in a healthy state. Things happen, and it’s best to have an insurance policy for MJs and other lottery winners.