Evaluate performance using a graph database and FPGAs
Every organization thinks that online response times should be faster and batch elapsed times should be shorter. Performance expectations, no matter how unrealistic or even ridiculous, exert a lot of pressure on IT management to work miracles.
Graph databases and Field Programmable Gate Arrays (FPGAs) can increase application performance by an order of magnitude or more, helping IT respond to these high expectations and ever more demanding systems.
Challenging information technology trends
Many CIOs are expected to respond to these challenging trends:
- Exploding data volumes.
- Increasing number of active end-users.
- Longer average end-user sessions.
- Expanding number of applications.
Also, many organizations are pursuing major information technology initiatives like:
- Data analytics and data visualization that consume significant computing resources.
- Digital transformation that increases the number of applications and improves the integration across applications.
- Data warehouses and data lakes that consume significant storage.
- IIoT that creates vast volumes of time series data.
- Artificial intelligence (AI) and machine learning (ML) that exhibit a voracious appetite for data and computing resources.
- The data-driven organization concept that requires lots of data integration among diverse data sources.
No amount of upgrading, tuning, and optimizing the computing environment can keep up with this growth in the consumption of computing resources. Moving applications to the cloud can help significantly, but only up to a point.
Graph databases and FPGAs have emerged as effective technologies that CIOs can implement to respond to these demanding trends. Let’s explore how they help.
Dr. Victor Lee, the Head of Product Strategy and Developer Relations at TigerGraph, a leading graph database software package vendor, presented at the recent Graph+AI World conference. He said, “Graph databases, like TigerGraph, offer significant advantages for applications that must quickly process large volumes of data that exhibits considerable connectedness among multiple entities in the database schema.”
The following features of graph databases contribute to successful applications that must manage huge data volumes and still deliver excellent performance at a scale that relational databases cannot handle.
Fast query speed
Graph databases deliver speedy query response times because queries only process the relevant relationships and not the total data volume in the database.
Graph databases routinely reduce query completion times by one to two orders of magnitude compared to the same application running on a relational database. As the number of entity instances increases, the difference in performance grows further.
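The following Python sketch illustrates why following stored relationships can stay fast as data grows. It is a toy model, not a benchmark of any product: the account names, the `build_edges` helper and the "scan" query are all hypothetical, and the scan is a deliberately naive stand-in for a join without an index (real relational engines use indexes and do better than this).

```python
from collections import deque


def build_edges(unrelated_edges):
    """Small cluster around 'acct0' plus many edges a query from 'acct0' never reaches."""
    edges = [("acct0", "acct1"), ("acct0", "acct2"),
             ("acct1", "acct3"), ("acct2", "acct4")]
    edges += [(f"x{i}", f"y{i}") for i in range(unrelated_edges)]
    return edges


def build_adjacency(edges):
    """Store each relationship next to its source entity (done once, at load time)."""
    adjacency = {}
    for src, dst in edges:
        adjacency.setdefault(src, []).append(dst)
    return adjacency


def k_hop_adjacency(adjacency, start, k):
    """Follow stored relationships; count the edges the query examines."""
    seen, frontier, examined = {start}, deque([(start, 0)]), 0
    while frontier:
        node, depth = frontier.popleft()
        if depth == k:
            continue
        for neighbour in adjacency.get(node, ()):
            examined += 1
            if neighbour not in seen:
                seen.add(neighbour)
                frontier.append((neighbour, depth + 1))
    return seen - {start}, examined


def k_hop_scan(edges, start, k):
    """Naive join emulation: every hop rescans the whole edge table."""
    seen, frontier, examined = {start}, {start}, 0
    for _ in range(k):
        next_frontier = set()
        for src, dst in edges:
            examined += 1
            if src in frontier and dst not in seen:
                next_frontier.add(dst)
        seen |= next_frontier
        frontier = next_frontier
    return seen - {start}, examined


if __name__ == "__main__":
    for size in (1_000, 100_000):
        edges = build_edges(size)
        _, touched = k_hop_adjacency(build_adjacency(edges), "acct0", 2)
        _, scanned = k_hop_scan(edges, "acct0", 2)
        print(f"{len(edges)} edges: adjacency examined {touched}, scan examined {scanned}")
```

In this model the adjacency query examines the same four edges however many unrelated edges are added, while the scan's work grows with the size of the table.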
This speed is essential to successfully operating the data analytics-oriented applications listed below. A good example application is financial fraud detection, which frequently requires querying millions of accounts and billions of transactions.
An important caveat about query speed is that only a native graph database can achieve fast query response times. Some graph database software packages are only wrappers running on top of a relational database. These solutions can only run as fast as the underlying relational database.
Entity relationships stored as data
Graph databases explicitly store the relationships among entities as data alongside the attribute data. By contrast, relational databases determine relationships at query time by performing more expensive and time-consuming joins. This simple distinction encapsulates a massive difference between relational databases and graph databases.
This relationship storage in graph databases:
- Makes the database schema much easier for software developers and business analysts to understand.
- Results in super-fast queries, even for complex queries or large data volumes.
A good example application is supply chain management, which requires representing the complex relationships among the thousands of components and parts suppliers associated with aircraft or automobile manufacturing.
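As a rough illustration of relationships stored as data, the Python sketch below models each supplier relationship as its own record with its own attributes, then walks those records to find every tier of supplier. The company names, the `Supplies` record and its fields are invented for this example and do not reflect any vendor's data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Supplies:
    """A relationship stored as a record: supplier -> customer, with its own attributes."""
    supplier: str
    customer: str
    part: str
    lead_time_days: int


# Hypothetical sample data.
RELATIONSHIPS = [
    Supplies("EngineCo", "AircraftCo", "engine", 90),
    Supplies("BladeCo", "EngineCo", "turbine blade", 45),
    Supplies("AlloyCo", "BladeCo", "nickel alloy", 30),
    Supplies("SeatCo", "AircraftCo", "seat", 20),
]


def upstream_suppliers(company, relationships):
    """Return every direct and indirect supplier of `company` by following stored relationships."""
    by_customer = {}
    for rel in relationships:
        by_customer.setdefault(rel.customer, []).append(rel)

    found, seen, stack = [], set(), [company]
    while stack:
        current = stack.pop()
        for rel in by_customer.get(current, []):
            if rel.supplier not in seen:
                seen.add(rel.supplier)
                found.append(rel)
                stack.append(rel.supplier)
    return found


if __name__ == "__main__":
    for rel in upstream_suppliers("AircraftCo", RELATIONSHIPS):
        print(f"{rel.supplier} supplies {rel.part} to {rel.customer} ({rel.lead_time_days} days)")
```

The traversal reaches third-tier suppliers without a separate query or join per tier; the number of tiers does not need to be known in advance.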
Entity relationships easy to understand
Whenever a DBMS can represent real-world relationships accurately and avoid kluges or workarounds such as cross-reference tables or composite keys, it’s easier for software developers to understand the organization of the data in the database. That ease of understanding leads to the following:
- More accurate, reliable solutions with less development effort.
- Reduced effort and elapsed time to implement future enhancements.
A good example application is computing infrastructure problem analysis. This application must represent the many components of a complex computing environment in the database schema in an easy-to-understand way.
Data structures responsive to change
Whenever a DBMS can accurately represent real-world data structures, the benefits described in the entity relationship sections above become easier to realize.
In graph databases, data structures are more flexible, and multiple data types are more easily combined. While data is still organized under a defined schema, the entity definitions and their relationship definitions can be altered dynamically.
These graph database capabilities are significant when the application data includes many data types. A good example application is Facebook comments or posts with any combination of text, images, videos, links, and geographic coordinates.
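A minimal sketch of this flexibility, assuming a simple in-memory property graph: the `PropertyGraph` class, the vertex types and the relationship names below are hypothetical and are not any vendor's API. The point is only that entities with different attribute sets can coexist and that a new relationship type can be introduced without reshaping existing data.

```python
class PropertyGraph:
    """Toy property graph: vertices carry arbitrary attributes, edges carry a type."""

    def __init__(self):
        self.vertices = {}   # vertex id -> {"type": ..., plus arbitrary attributes}
        self.edges = []      # (source id, relationship type, target id)

    def add_vertex(self, vertex_id, vertex_type, **attributes):
        self.vertices[vertex_id] = {"type": vertex_type, **attributes}

    def add_edge(self, source, relationship, target):
        self.edges.append((source, relationship, target))

    def neighbours(self, source, relationship):
        """Targets reachable from `source` over one edge of the given type."""
        return [t for s, r, t in self.edges if s == source and r == relationship]


graph = PropertyGraph()
graph.add_vertex("post1", "Post", text="Landed in Calgary")
graph.add_vertex("img1", "Image", url="https://example.com/a.jpg", width=1200)
graph.add_vertex("place1", "Place", latitude=51.05, longitude=-114.07)

graph.add_edge("post1", "HAS_MEDIA", "img1")
# A relationship type introduced later; existing vertices and edges are untouched.
graph.add_edge("post1", "LOCATED_AT", "place1")

print(graph.neighbours("post1", "HAS_MEDIA"))
print(graph.neighbours("post1", "LOCATED_AT"))
```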
Field Programmable Gate Arrays (FPGAs)
Kumar Deepak, a Distinguished Engineer at Xilinx, a leading vendor of FPGA hardware and related software, presented at the recent Graph+AI World conference. He said, “Our graph database customers experience significant performance gains when they add Xilinx FPGAs to their computing infrastructure.”
The following features of FPGAs deliver excellent performance for graph database applications that operate with data volumes at a scale that even a multi-CPU server cluster cannot manage.
There is a limit to what adding more CPUs to a server can achieve. Each additional CPU produces a smaller performance increment because of limitations described by Amdahl’s law and limited memory-to-CPU bandwidth.
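Amdahl's law can be worked through with a few lines of Python. In its usual form, the speedup from n processors is 1 / ((1 - p) + p / n), where p is the fraction of the work that can run in parallel. The 90% parallel fraction below is an assumed figure chosen for illustration, not a measurement, and the formula does not model the memory bandwidth limit mentioned above.

```python
def amdahl_speedup(parallel_fraction, processors):
    """Theoretical speedup under Amdahl's law for a fixed-size workload."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if processors < 1:
        raise ValueError("processors must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / processors)


if __name__ == "__main__":
    p = 0.9  # assumed: 90% of the work parallelizes
    for n in (1, 2, 8, 64, 1024):
        print(f"{n:>5} CPUs: speedup {amdahl_speedup(p, n):.2f}x")
    print(f"upper bound: {1.0 / (1.0 - p):.0f}x")
```

With these assumptions, going from 8 to 64 CPUs less than doubles the speedup, and no number of CPUs can exceed 10x, because the serial 10% of the work never gets faster.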
FPGAs address this scalability limitation of CPUs by pipelining parallel computations and offering much higher memory bandwidth with low latency.
This FPGA capability significantly raises the:
- Data volumes that a graph database application can process while still delivering excellent performance.
- Number of concurrent tasks that a graph database can process.
Fast execution speed
The sequential instruction processing architecture of general-purpose server CPUs is designed to handle widely varying workloads. This versatility limits their execution speed. Increasing the clock speed helps, but other constraints limit that approach.
FPGAs address this speed limitation of CPUs by offering a massively parallel processing architecture that performs a focused number of functions extremely fast. Further, the parallel processing elements of FPGAs are typically pipelined to process even more data per FPGA clock cycle than CPUs. Parallel processing and pipelining can apply to any of the following situations:
- Instructions – perform multiple instructions at the same time.
- Tasks – perform different functions on a single set of data simultaneously.
- Data – perform the same instruction for different blocks of data simultaneously.
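The effect of pipelining and data parallelism can be estimated with an idealized cycle count. The sketch below assumes every stage takes exactly one clock cycle and that there are no stalls or memory waits, which real hardware will not fully achieve. The function names and the figures used (1,000 items, 5 stages, 8 lanes) are assumptions for illustration, not properties of any particular FPGA.

```python
import math


def cycles_sequential(items, stages):
    """Each item passes through all stages before the next item starts."""
    return items * stages


def cycles_pipelined(items, stages, lanes=1):
    """Pipelined stages, optionally replicated across parallel lanes.

    After the pipeline fills (`stages` cycles for the first batch), one batch
    of `lanes` items completes every cycle.
    """
    if items == 0:
        return 0
    batches = math.ceil(items / lanes)
    return stages + batches - 1


if __name__ == "__main__":
    items, stages = 1_000, 5
    print("sequential:          ", cycles_sequential(items, stages))
    print("pipelined:           ", cycles_pipelined(items, stages))
    print("pipelined, 8 lanes:  ", cycles_pipelined(items, stages, lanes=8))
```

Under these assumptions, pipelining alone cuts 5,000 cycles to 1,004, and eight parallel lanes cut it further to 129. Cycle counts are not the whole story, since FPGA clock rates are generally lower than CPU clock rates.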
This FPGA capability can reduce query completion times by one to two orders of magnitude compared to the same application running without the FPGA.
Architected for algorithms
Server CPUs are architected for the instructions associated with transaction processing. That’s the right choice for many applications but not for the graph-oriented applications listed at the end of this article.
By contrast, FPGAs can be architected for graph algorithms by configuring them with just the instructions associated with graph algorithms. To support the effective use of FPGAs, graph database software package vendors include or license software routines that perform many of the frequently used graph algorithms.
This FPGA capability significantly raises the complexity of algorithms that a graph database application can process while still delivering excellent performance.
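To give a sense of what a graph algorithm looks like, here is a plain-Python sketch of connected components, which groups entities that are linked to one another directly or indirectly. The article does not say which algorithms vendors provide, so this choice is an assumption made for illustration, and the code runs on an ordinary CPU rather than an FPGA. The account names are invented.

```python
from collections import defaultdict


def connected_components(edges):
    """Group nodes into sets where every node can reach every other node."""
    adjacency = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)   # treat each relationship as undirected
        adjacency[b].add(a)

    components, seen = [], set()
    for start in adjacency:
        if start in seen:
            continue
        component, stack = set(), [start]
        seen.add(start)
        while stack:
            node = stack.pop()
            component.add(node)
            for neighbour in adjacency[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        components.append(component)
    return components


if __name__ == "__main__":
    # Hypothetical transfers between accounts.
    transfers = [("acctA", "acctB"), ("acctB", "acctC"), ("acctD", "acctE")]
    for group in connected_components(transfers):
        print(sorted(group))
```

In a fraud detection setting, for example, each resulting group could be reviewed as a cluster of accounts that move money among themselves.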
The applications where graph databases and FPGAs provide significant performance, software development, and manageability improvements over relational databases include:
- Fraud detection, money laundering.
- Supply chain optimization.
- Customer 360 interaction analysis.
- Product recommendations.
- Bioinformatics, drug discovery.
- Social network monitoring.
- Risk management.
- Identity and access management.
- Computing infrastructure monitoring.
If you need faster application performance, evaluate using a graph database and FPGAs.
Yogi Schulz has over 40 years of information technology experience in various industries. Yogi works extensively in the petroleum industry. He manages projects that arise from changes in business requirements, the need to leverage technology opportunities, and mergers. His specialties include IT strategy, web strategy and project management.
© Troy Media