NAV Data Lake

What is the NAV Data Lake?

The NAV Data Lake (also known as the Tax Authority Data Lake) is an extensive digital data repository developed by the Hungarian National Tax and Customs Administration (NAV). It is designed to collect, structure, and analyze taxation-related data received from economic actors, including businesses, private individuals, and financial institutions. Its purpose is to give NAV a centralized, interconnected data environment that enables more effective audits, risk assessments, and automated processing.

The data lake includes not only raw data from mandatory reporting obligations, but also processed and derived information.

What kind of data is stored in the NAV Data Lake?

The NAV Data Lake contains data from diverse sources and in various formats, used for purposes such as risk analysis, audits, and statistical evaluation. Key data categories include:

  • Online Invoice system: detailed data on issued and received invoices,
  • Online and eCash Register systems: real-time sales and receipt data,
  • eReceipt (eNyugta) system reports,
  • Import-related VAT data,
  • Tax account data, including payments and outstanding debts,
  • Tax return information (e.g. VAT, corporate tax, personal income tax),
  • Taxpayer classifications, such as taxpayers free of public debt, trusted or high-risk taxpayers, and entities employing undeclared workers.

Together, these datasets give NAV a comprehensive picture of taxpayers’ economic activity and behavior.

How does the Data Lake differ from a traditional database?

Unlike traditional databases, the NAV Data Lake integrates data of different formats and from multiple sources. This allows for:

  • Multidimensional data analysis, such as comparing invoice, receipt, and tax return data for a given business,
  • Real-time or near-real-time data processing,
  • The use of AI-based risk assessment models.

The data lake is structured around specific themes (e.g. VAT reporting), linking various types of information across sources.
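To make the idea of theme-based linking more concrete, here is a minimal Python sketch. It is purely illustrative: the record shapes, the field names (`tax_number`, `vat_amount`, `declared_vat`) and the `link_by_taxpayer` helper are assumptions invented for this example and do not reflect NAV's actual schema or implementation.

```python
import csv
import io
import json
from collections import defaultdict

# Hypothetical sample inputs in two different formats (all field names are assumptions).
INVOICES_JSON = (
    '[{"tax_number": "12345678", "vat_amount": 270},'
    ' {"tax_number": "12345678", "vat_amount": 540},'
    ' {"tax_number": "87654321", "vat_amount": 100}]'
)
RETURNS_CSV = "tax_number,declared_vat\n12345678,810\n87654321,60\n"


def link_by_taxpayer(invoices_json: str, returns_csv: str) -> dict:
    """Combine JSON invoice records and CSV return rows into one view per taxpayer."""
    linked = defaultdict(lambda: {"invoiced_vat": 0, "declared_vat": 0})

    # Source 1: invoice records delivered as JSON.
    for invoice in json.loads(invoices_json):
        linked[invoice["tax_number"]]["invoiced_vat"] += invoice["vat_amount"]

    # Source 2: tax return rows delivered as CSV (values arrive as strings).
    for row in csv.DictReader(io.StringIO(returns_csv)):
        linked[row["tax_number"]]["declared_vat"] += int(row["declared_vat"])

    return dict(linked)


# One "VAT theme" view keyed by taxpayer, built from both sources.
vat_theme = link_by_taxpayer(INVOICES_JSON, RETURNS_CSV)
```

The point of the sketch is only that records arriving in different formats can be keyed on a shared identifier and viewed together under one theme. A real data lake would do this at a very different scale and with far richer data.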

How does NAV use the Data Lake?

NAV uses the Data Lake for inspection, analysis, and automation purposes. Key use cases include:

  • Risk assessment: identifying whether a business is a reliable taxpayer and whether it should be subject to targeted audits,
  • Automated cross-checks: reconciling invoice, receipt, and tax return data to identify discrepancies or irregularities,
  • Digital auditing: replacing traditional on-site inspections with digital reviews based on available data,
  • Taxpayer classification: using the data to assign reliability ratings (e.g. “trusted” or “high-risk”).
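As a rough illustration of what an automated cross-check could look like, the Python sketch below compares the VAT reported through invoices and receipts with the VAT declared on a return and flags large gaps. Everything in it is hypothetical: the `TaxpayerSummary` fields, the `cross_check` function, and the tolerance value are assumptions chosen for the example, not NAV's actual rules or thresholds.

```python
from dataclasses import dataclass


@dataclass
class TaxpayerSummary:
    tax_number: str
    invoiced_vat: int   # VAT total from invoice reports (hypothetical field)
    receipt_vat: int    # VAT total from cash register receipts (hypothetical field)
    declared_vat: int   # VAT stated on the tax return (hypothetical field)


def cross_check(summary: TaxpayerSummary, tolerance: int = 1000) -> dict:
    """Compare reported and declared VAT; flag the taxpayer if the gap exceeds the tolerance."""
    reported = summary.invoiced_vat + summary.receipt_vat
    gap = reported - summary.declared_vat
    return {
        "tax_number": summary.tax_number,
        "gap": gap,
        "flagged": abs(gap) > tolerance,  # the tolerance is an arbitrary illustrative value
    }


summaries = [
    TaxpayerSummary("12345678", invoiced_vat=810, receipt_vat=200, declared_vat=1010),
    TaxpayerSummary("87654321", invoiced_vat=5000, receipt_vat=0, declared_vat=1500),
]

results = [cross_check(s) for s in summaries]
flagged = [r["tax_number"] for r in results if r["flagged"]]
```

In this sample data, the first taxpayer's figures match and the second shows a gap above the tolerance, so only the second would be picked out for closer review. A real reconciliation would have to account for timing differences, input VAT, corrections, and many other factors that this sketch ignores.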

Can taxpayers access the data stored in the Data Lake?

Taxpayers do not currently have direct, full access to the internal structure of the Data Lake. However, selected data subsets can be queried via:

  • The Online Invoice system interface,
  • The NAV eBEV portal, for returns and tax account balances,
  • The eReceipt (Nyugtatár) and eInvoice (eSzámla) systems, for receipt and invoice records.

In the future, especially with the integration of artificial intelligence and API-based platforms, access to additional datasets may expand through machine-to-machine (M2M) interfaces.

Summary

The NAV Data Lake is a core element of Hungary’s digital tax administration infrastructure. It collects and connects a wide range of data from various economic actors and reporting systems. The platform enables real-time oversight, targeted risk analysis, and administrative automation. Its operation significantly transforms how NAV conducts audits and classifies taxpayers, making its processes more efficient and predictive.

Official definition

The NAV Data Lake is a large-scale database managed by the Hungarian National Tax and Customs Administration, containing data submitted through taxpayer declarations and reports, as well as processed and derived information. It stores multiple types of data in a unified structure, organized by thematic areas.

