20 Nov 2024

Empowering Analytics with Microsoft Fabric – Our Evolution of Velox

At Coeo, we help our clients solve their data challenges with robust analytics solutions. Over time, we observed recurring patterns in the challenges our customers faced and the solutions we implemented. This inspired us to create Velox—our solution accelerator designed to help customers overcome these common obstacles more efficiently. Velox is built on a data lakehouse architecture, using cloud-based data storage and analytics tools. 

In its current architecture, Velox integrates the following tools: 

  • Azure Data Lake Storage Gen2 for secure, scalable data storage 
  • Azure Data Factory to streamline data ingestion 
  • Databricks to enable data transformation and modelling within the lakehouse 
  • Azure Synapse Analytics to deliver refined, ready-to-use data for analytics 
  • Power BI for insightful data visualization, reporting, and self-service analytics 

Each step of this process is data-driven. Using an Azure SQL database for configuration, our customers can easily ingest new data sources or update data warehouse tables by simply updating the configuration database—keeping the entire process adaptable and efficient. 
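To make the configuration-driven idea concrete, here is a minimal sketch rather than Velox's actual implementation. It assumes a single hypothetical table, `ingestion_config`, with invented column names, and uses Python's built-in `sqlite3` as a stand-in for the Azure SQL configuration database so that it runs anywhere.

```python
import sqlite3


def load_ingestion_tasks(conn):
    """Return one ingestion task per enabled source in the config table."""
    rows = conn.execute(
        "SELECT source_name, source_type, target_path "
        "FROM ingestion_config WHERE enabled = 1 ORDER BY source_name"
    ).fetchall()
    return [
        {"source": name, "type": source_type, "target": target_path}
        for name, source_type, target_path in rows
    ]


def build_demo_config():
    """Create an in-memory stand-in for the configuration database.

    The table name, columns and rows are hypothetical examples.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE ingestion_config ("
        "source_name TEXT PRIMARY KEY, source_type TEXT, "
        "target_path TEXT, enabled INTEGER)"
    )
    conn.executemany(
        "INSERT INTO ingestion_config VALUES (?, ?, ?, ?)",
        [
            ("crm_accounts", "sql", "raw/crm/accounts", 1),
            ("web_clicks", "api", "raw/web/clicks", 0),  # disabled source
        ],
    )
    conn.commit()
    return conn


if __name__ == "__main__":
    conn = build_demo_config()
    # Onboarding a new source is a row insert, not a pipeline change.
    conn.execute(
        "INSERT INTO ingestion_config VALUES (?, ?, ?, ?)",
        ("erp_orders", "sql", "raw/erp/orders", 1),
    )
    for task in load_ingestion_tasks(conn):
        print(task)
```

The point of the design is that the pipeline code stays the same: it reads whatever rows are enabled and acts on them, so adding or retiring a source only means changing data.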

Velox connects with our managed services, allowing us to monitor all lakehouse operations, aggregate logs from various Azure services, and present them in a user-friendly portal where we can proactively address potential issues. 

Figure 1 – A Velox Lakehouse Architecture

Adapting Velox for Microsoft Fabric

The arrival of Microsoft Fabric has opened new opportunities for Velox, allowing us to enhance our architecture by migrating it into the Fabric ecosystem. Fabric brings together several tools within one platform, streamlining both the integration process and ongoing management. Here’s how each component of Velox translates into Fabric:

  • Azure Data Lake Storage Gen2 → OneLake
  • Azure Data Factory → Fabric Data Factory
  • Databricks → Fabric Notebooks and Spark
  • Azure Synapse Analytics → Lakehouses and Warehouses
  • Power BI remains integral to our analytics workflow

When migrating, we identified two key areas that initially seemed challenging: configuration data management and logging.

Figure 2 – Velox Architecture in Fabric

Configuration Database in Fabric

Our configuration database enables clients to make updates to their analytics solutions quickly and easily. However, Fabric initially lacked a direct equivalent for this functionality. While Lakehouses and Warehouses were options, neither was a perfect match. Warehouses are ideal for large-scale analytical queries, not the smaller, transactional queries typical in configuration databases. Lakehouses, on the other hand, require data manipulation through notebooks rather than T-SQL, which could introduce a learning curve for some users.

Fortunately, the recent introduction of Fabric Databases has provided the ideal solution. Built on the familiar SQL Server engine, Fabric Databases bring transaction-friendly workloads, such as those needed for configuration data, into the Fabric environment. The result is an intuitive experience for anyone familiar with SQL Server, combined with Fabric’s security and cloud benefits such as integrated cloud authentication and encryption.

Fabric Databases have also expanded our use cases within Fabric. For instance, we’re working with a client who is centralising data in OneLake, using it for analytics and application development. Fabric now allows them to manage both workloads within a single environment, eliminating the need for additional Azure resources.

Logging and Monitoring

For our Velox logging solution, we rely on APIs and Log Analytics to collect operational data. Initially, Fabric offered only partial solutions in this area. For example, while Spark logs could be directed to Log Analytics, Fabric Data Factory pipeline logs weren’t directly accessible. To bridge this gap, we experimented with running pipelines via REST APIs in notebooks, gathering logs into Spark and using Fabric’s real-time analytics for proactive monitoring. We can explore that topic in more detail in a future post.
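As a rough illustration of triggering a pipeline over REST from a notebook, the sketch below only builds the HTTP request and does not send it. The endpoint shape follows our understanding of the Fabric job scheduler's on-demand run call, and the `executionData` payload, the IDs and the token are all assumptions to check against the current Fabric REST API reference. It is not the code we used in Velox.

```python
import json
import urllib.request

# Base URL of the Fabric REST API (assumed; verify against current docs).
FABRIC_API = "https://api.fabric.microsoft.com/v1"


def build_pipeline_run_request(workspace_id, pipeline_id, token, parameters=None):
    """Build (but do not send) a POST request that asks Fabric to run a pipeline.

    workspace_id, pipeline_id and token are supplied by the caller; the
    payload shape is an assumption, not a confirmed contract.
    """
    url = (
        f"{FABRIC_API}/workspaces/{workspace_id}"
        f"/items/{pipeline_id}/jobs/instances?jobType=Pipeline"
    )
    body = {"executionData": {"parameters": parameters or {}}}
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    # Placeholder values only; nothing is sent over the network here.
    request = build_pipeline_run_request(
        "ws-123", "pl-456", "fake-token", {"load_date": "2024-11-20"}
    )
    print(request.get_method(), request.full_url)
```

Sending the request would be a call to `urllib.request.urlopen(request)` with a real access token. Polling the run status and writing the results to a log table are left out of this sketch.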

The introduction of Workspace Monitoring in Fabric has streamlined this process. Workspace Monitoring offers detailed logging information and metrics on Fabric workspace activity, empowering admins and developers to pinpoint issues, troubleshoot performance, and monitor capacity usage. Additionally, the data is stored in a read-only Fabric Eventhouse KQL database, which allows for pattern and anomaly analysis through familiar query methods.

Future Possibilities with Fabric

With new features being added to Microsoft Fabric monthly, gaps in functionality are continually being narrowed, and we’re excited to see these enhancements roll out. As Microsoft Fabric evolves, we at Coeo look forward to using these new capabilities to deliver even greater value to our customers. Whether you’re considering Fabric or have already started your journey and run into challenges, we’d love to help you explore the possibilities.

Download our Microsoft Fabric in 5 Days flyer to find out more about how you can get started.