Tech Thursday with SNOK: How to connect AI to on-premise SAP

Artificial intelligence in SAP is no longer a future prospect - it is the present. The problem is that most solutions assume a move to the cloud. But what about the thousands of Polish organisations that must, or wish to, remain on on-premise environments? The answer is simpler than it might seem - and does not require an infrastructure revolution.

When Anthropic published the Model Context Protocol specification in November 2024, few realised how fundamentally it would change the approach to integrating enterprise systems with language models. MCP solves a problem the industry had grappled with for years - every combination of an AI model and data source previously required a dedicated connector. With dozens of models and hundreds of source systems, this created an unimaginable web of dependencies that exceeded the ability of most organisations to maintain.

“For years we watched companies build point-to-point integrations between SAP and various analytical systems,” says Michal Korzen, CTO of SNOK and architect of AI solutions. “Each such integration was a separate project, a separate budget, a separate risk. MCP changes the rules of the game by introducing a universal language of communication between the world of AI and business systems.”

The MCP protocol is built on three components:

• MCP Host - the orchestrator that manages the overall communication

• MCP Client - the communication module embedded in the AI application

• MCP Server - a lightweight programme exposing data from external systems

Transport takes place via JSON-RPC 2.0 using either the HTTP protocol or standard input/output, ensuring bidirectional, real-time communication.

Significantly, MCP has already been adopted by the largest players in the market - OpenAI, Google DeepMind and Microsoft have all incorporated this protocol into their ecosystems. This means that investment in MCP-based infrastructure rests on solid foundations, with a long-term outlook for continued support.

SNOK, as one of the first SAP partners in Poland, has built its own MCP server dedicated to SAP systems. This solution allows AI models direct access to business data - from simple table reads, through calls to complex BAPI functions, to the automation of entire SAP GUI transactions. Unlike standard OData access, which is largely limited to read operations and simple modifications, SNOK’s MCP server exposes full business logic to the AI layer.

“We have been building this MCP server over the past few months, testing it in the real production environments of our clients,” adds Michał Korzeń. “We know where the pitfalls are, which BAPI functions behave unexpectedly under heavy load, and how to handle SAP sessions in the context of multiple parallel AI queries. This is knowledge that can only be gained through practice.”

Traditional methods of integrating SAP with external systems - RFC, BAPI and OData - remain the foundation of every integration architecture. However, each has its limitations, which take on particular significance in the context of working with language models.

RFC (Remote Function Call) - a binary, highly optimised protocol available since the days of SAP R/3. It offers the highest performance of all integration methods, but requires the SAP JCo library for Java or NCo for the .NET platform and does not natively support the HTTP protocol. For AI integration, an intermediary layer is therefore necessary to “wrap” RFC calls as a REST interface.

BAPI (Business Application Programming Interface) - function modules with built-in RFC support, which SAP designs with interface stability in mind. The vendor never changes the signature of an existing BAPI, instead creating a new version. This is a fundamental feature for long-term integrations - an interface built once will keep working for years without requiring modification.

OData - the preferred method for integration with language models. Native compatibility with HTTP and JSON, self-documenting metadata, and support for standard CRUD operations mean that AI models can communicate seamlessly with SAP systems. S/4HANA offers native support for OData v4 through the SAP Gateway. For older ECC 6.0 systems, installation of SAP NetWeaver Gateway version 7.4 or later is required as an additional component.

“OData is a great starting point, but on its own it is not enough to build a fully-fledged AI integration with SAP,” explains Jaroslaw Kamil Zdanowski, Partner at SNOK responsible for cybersecurity and SAP BASIS administration. “OData mainly provides read access to data and simple operations. The real value of AI in SAP emerges only when a model can call full business logic, automate processes and trigger actions within the system.”

In practice, the difference between OData access and full integration via MCP is fundamental. Imagine a scenario in which a user asks an AI assistant about the status of an order. With OData access, the model can only read data from the relevant table and return the information. With MCP integration, the same model can check the status, identify a delay, call a BAPI function to escalate the issue, create a notification for the responsible employee, and update the order’s priority - all within a single conversation with the user.

SNOK offers its own MCP server dedicated to SAP systems, which enables precisely such scenarios. The solution provides access to SAP tables, CDS views, the ability to call BAPI functions and function modules, and the automation of SAP GUI transactions. Importantly, it works both in a containerised on-premise environment and in a service-based model, retaining identical interfaces in both scenarios. A client can start with an on-premise deployment and, in future, should regulatory or business requirements change, move part of the workload to the cloud without rewriting the integration.

SAP Business Technology Platform serves as the central bridge connecting on-premise systems with AI services - regardless of whether those services run in the cloud or in the client’s own data centre. A key element of this architecture is the Cloud Connector - a proxy agent operating within the organisation’s local network.

Cloud Connector operates on a reverse-invoke model, meaning the connection is initiated from the on-premise environment rather than from outside. This is a fundamental difference from a security perspective - it eliminates the need to open ports in the firewall and drastically reduces the attack surface.

Data flow in the hybrid architecture proceeds as follows:

SAP ECC or S/4HANA system running on-premise
Cloud Connector - secure tunnel to the cloud
SAP BTP Connectivity Service - entry point to the platform
SAP Integration Suite - data transformation and routing
SAP AI Core + Generative AI Hub - processing by AI models

SAP AI Core is a runtime environment for AI and machine learning workloads, running on a managed Kubernetes cluster. It supports popular libraries such as TensorFlow, PyTorch and Scikit-learn, while offering automatic scaling and optional GPU support. Generative AI Hub, meanwhile, consolidates access to language models from various providers - Azure OpenAI with GPT-4 models, AWS Bedrock with Claude models from Anthropic, as well as open-source models such as Llama 3-70b or Mixtral-8x7b.

Particularly interesting is the ability to deploy your own models on the SAP platform. AI Core fully supports a “bring your own model” approach, enabling the deployment of custom Docker containers with inference servers - whether Ollama, vLLM or llama.cpp. This requires defining a serving template in YAML format and access to a Docker registry in the cloud.

A novelty introduced in 2024-2025 is the Edge Integration Cell - a component enabling local execution of integrations for scenarios with requirements around data residency, low latency or regulatory compliance. Deployment takes place on a Kubernetes cluster - whether in the public cloud or on local clusters.

“Edge Integration Cell is a breakthrough for organisations that need full control over data flow,” emphasises Michał Korzeń. “Previously, a hybrid SAP BTP architecture always meant that at least part of the processing had to take place in the cloud. Now we can build a fully on-premise AI solution while retaining all the benefits of the SAP integration platform.”

SAP Joule is a generative assistant introduced in September 2023, which now offers more than 1,900 skills and 300 AI scenarios. It supports navigation within SAP Fiori applications, execution of transactions such as orders or invoices, and access to documentation. It sounds promising, but the devil is in the detail.

Joule’s support status varies significantly depending on the system version. S/4HANA Cloud Public Edition is fully supported. S/4HANA Cloud Private Edition also received support starting with version 2023 FPS01. S/4HANA on-premise can use Joule indirectly via the BTP platform. ECC 6.0, meanwhile, requires the use of Joule Studio in conjunction with the BTP Destinations mechanism.

The key limitation for Polish organisations? The Polish language is not currently supported by Joule. Planned language extensions include German, Spanish, French and Portuguese, but there is no official information about the addition of Polish in the foreseeable future.

Since July 2025, Joule Studio has supported the SAP BTP Destination Service, enabling connections to the programming interfaces of on-premise systems without the need to replicate data to the cloud. This is an important step forward, but it still does not solve the language problem or eliminate dependency on SAP’s cloud infrastructure.

Joule uses a multi-model architecture with SAP’s own models - including a specialised ABAP model trained on 250 million lines of code in that language - as well as external models: GPT-4, Claude 3.5 Sonnet, and models from Google, Meta and Mistral AI.

“Joule is a great tool for organisations that are already in the SAP cloud and operate mainly in English,” assesses Jacek Bugajski, CEO of SNOK. “For the Polish market, where we have thousands of companies working on on-premise systems and needing support in the Polish language, we need different solutions. And those solutions exist - you just need to know how to implement them.”

In practice, organisations that decide to integrate AI with on-premise SAP gain a flexibility unavailable in the cloud model. They can choose the language model best suited to their needs - whether a global model from OpenAI, the European Mistral, or the Polish Bielik AI. They can control which data is available to the model and implement their own filtering and moderation mechanisms. Finally, they can scale infrastructure according to actual usage, without depending on the pricing of external providers.

SNOK has already delivered AI integration projects with SAP systems for clients in the healthcare, manufacturing and public administration sectors. Every implementation teaches us something new - about the specifics of different industries, the requirements of end users, and the challenges of integrating with existing infrastructure. This knowledge translates into smoother and more predictable projects for subsequent clients.

The partnership between SAP and NVIDIA, expanded at the NVIDIA GTC 2024 conference, opens up entirely new possibilities for organisations planning to deploy AI in an on-premise environment. The collaboration includes integration of the NVIDIA AI Enterprise platform with SAP Datasphere, access to NVIDIA NIM microservices within the SAP ecosystem, and the use of NVIDIA HGX H100 infrastructure to train custom models specialised for the SAP environment.

Jensen Huang, CEO of NVIDIA, described the data accumulated in SAP systems as a “gold mine” that can be transformed into personalised AI agents. This is not marketing exaggeration - in most organisations, SAP systems contain decades of transactional history, user behaviour patterns and domain knowledge encoded in customisations and extensions.

NVIDIA AI Enterprise is a comprehensive platform for running AI models in an on-premise environment. It includes:

• NVIDIA NIM - inference microservices for AI models

• NVIDIA NeMo - a library for building and training large language models

• Triton Inference Server - support for multiple machine learning libraries simultaneously • TensorRT-LLM - performance optimisation on GPUs

• NeMo Guardrails - security and content moderation mechanisms

Licensing for NVIDIA AI Enterprise costs 4,500 dollars per GPU annually under a subscription model, with the option of three- or five-year contracts. GPUs with a built-in licence - such as the H100 PCIe or H200 NVL - include a five-year AI Enterprise licence in the hardware purchase price.

Choosing the right GPU depends on the intended use case:

• H200 (141 GB HBM3e) - for the largest models exceeding 100 billion parameters | ~35,000-40,000 USD

• H100 SXM (80 GB HBM3) - for models of 70 billion+ parameters and training tasks | ~25,000-35,000 USD

• L40S (48 GB GDDR6) - for inference on models of 7-13 billion parameters | ~7,000-10,000 USD

• L4 (24 GB GDDR6) - for small and medium-sized models | ~2,000-4,000 USD

For most scenarios within an SAP environment, models in the 7-13 billion parameter range are sufficient, and these can be run efficiently on L40S or even L4 cards. There is no need for a server room with dozens of H100 cards to obtain real business value from AI.

NVIDIA NIM consists of containerised inference microservices with predefined, optimised models and a programming interface compatible with OpenAI. Deployment on any NVIDIA infrastructure takes literally a few minutes. Performance gains are impressive - Llama 3.1 8B achieves twice the throughput compared to a standard deployment, Llama 3 70B up to five times higher, and Mixtral 8x7B more than four times higher.

“The SAP-NVIDIA partnership is a signal to the market that AI in an on-premise environment is a fully-fledged development path, not a workaround or a stopgap,” comments Michał Korzeń. “SNOK, as a partner of both SAP and NVIDIA, can offer clients complete solutions - from hardware selection, through integration with SAP systems, to deployment and maintenance. It is not rocket science, but it does require experience in both worlds.”

The Polish language model Bielik is the first fully Polish alternative to foreign models, created by the SpeakLeash Foundation in cooperation with ACK Cyfronet AGH. The model won the Technology of the Year 2025 award granted by the Money.pl portal and made it into the top ten most influential open-source AI projects in the world according to the Spotlight AI 2025 ranking.

Various versions of the model are available, tailored to different hardware requirements and use cases:

• Bielik-11B-v2.2-Instruct (11 billion parameters) - the best conversational version

• Bielik-7B-Instruct-v0.1 (7 billion parameters) - the base conversational version

• Bielik-4.5B-v3 (4.5 billion parameters) - a balanced performance/requirements approach

• Bielik-1.5B-v3 (1.5 billion parameters) - a compact version for more modest hardware

Benchmark results confirm the model’s quality. On the open Polish LLM Leaderboard, the Bielik-11B-v2.2-Instruct model achieves a score of 65.57 points, outperforming the Meta-Llama-3.1-70B-Instruct model, which scores 65.49 - despite the latter being six times larger. In the Polish MT-Bench test, which assesses conversational ability, Bielik scores 8.11 points, beating GPT-3.5-turbo, which scores 7.87.

Bielik’s key advantages for Polish enterprises are fundamental:

• Native tokenisation of Polish text - the model requires fewer tokens to process Polish words, translating into lower costs and higher efficiency

• 100% Polish-language responses - other models often switch to English during longer conversations

• Understanding of cultural context - Polish idioms, sarcasm, and specific forms of humour

• Trained on 400 billion tokens of Polish text - a unique understanding of the language

Hardware requirements are surprisingly accessible:

• Bielik-11B (Q4_K_M) - ~8 GB RAM, runs even on a CPU

• Bielik-11B (FP16) - ~22 GB VRAM, A100-40GB or H100 cards

• Bielik-7B (Q4_K_M) - 8-16 GB RAM, RTX 3060+ or CPU

The Apache 2.0 licence allows full commercial use, modification and distribution without additional licensing fees. This is a fundamental difference compared to models that charge for every API call.

It is worth noting the practical scenarios for using Bielik in an SAP environment. An AI assistant can answer user questions about stock levels, order statuses or payment terms - all in natural Polish. It can generate reports and summaries based on data from the SAP system, translate SAP’s technical jargon into understandable business language, and even help new employees learn to operate the system. In more advanced scenarios, Bielik can analyse patterns in historical data, identify anomalies and suggest corrective actions.

Applications in the ABAP field - SAP’s programming language - are particularly interesting. Bielik can document existing code, generate unit tests for ABAP programmes, and even help create new reports and functions. For teams maintaining SAP systems with tens of thousands of lines of custom code, this represents a potentially enormous time saving.

“Bielik is a game-changer for the Polish SAP market,” says Jacek Bugajski. “We have a model that understands Polish better than foreign models many times its size, it can be run on your own infrastructure, and there is no need to pay for every query. For public institutions that, for regulatory reasons, cannot use the cloud, this is the only realistic path to AI.”

SNOK, as one of the first integrators in Poland, tested Bielik in real SAP environments. The results exceeded expectations - particularly in tasks requiring an understanding of Polish business and legal context. The model correctly interprets Polish abbreviations and industry terms, understands the specifics of the Polish tax and accounting system, and handles Polish proper names and date or number formats well.

The regulatory context in Poland clearly favours local solutions - particularly for the public sector and organisations operating in regulated industries. The National Cybersecurity System Act, together with the amendment implementing the NIS 2 directive, imposes specific obligations on essential and important entities.

Essential entities include organisations from the energy, transport, banking and healthcare sectors. Important entities include, among others, companies in the ICT, industrial manufacturing and chemical sectors. Both categories must implement a security management system, conduct risk assessments, report incidents within 24 hours (initial notification) and 72 hours (full report), and undergo audits every two years. Penalties for non-compliance can reach 10 million euros or 2% of annual turnover.

GDPR complicates the transfer of data to the cloud in several ways:

• Article 22 - the right to an explanation in the case of automated decisions (a direct link to AI scenarios)

• Article 28 - the requirement for a data processing agreement with cloud service providers

• Articles 44-50 - restrictions on transferring data to third countries

The US CLOUD Act creates a risk of access by US authorities to data processed by US providers - regardless of the physical location of the servers. Following the Schrems II ruling, standard contractual clauses require an additional risk analysis for every data transfer.

The AI Act - the regulation on artificial intelligence - is coming into force in stages. Since February 2025, a ban on high-risk AI practices has applied. From August 2025, provisions on general-purpose models take effect. Full application of all provisions begins from August 2026. The Polish draft law on AI systems, published in October 2024, provides for the establishment of a Commission for AI Development and Security.

“Regulations are not an obstacle - they are a signpost,” emphasises Jarosław Zdanowski. “Organisations that invest today in AI infrastructure compliant with KSC, GDPR and AI Act requirements will gain a competitive advantage once these regulations begin to be enforced. SNOK helps clients not only meet minimum requirements, but build an architecture that will be resilient to future regulatory changes.”

The SAP market in Poland comprises more than 1,500 organisations - including international corporations, large Polish enterprises and public administration units. According to the SAP Poland 2025 report, 38% of companies already use cloud-based ERP systems, and 39% plan implementation in the coming year. This means that more than 60% of the market still operates on on-premise systems - and a significant proportion of them intend to remain there.

A sidecar architecture, aligned with SAP Clean Core principles, is the most commonly recommended approach to integrating AI with on-premise SAP systems. In this model, AI workloads run outside the SAP core, communicating with it exclusively through official programming interfaces. An API or OData layer mediates between the SAP ECC or S/4HANA system and the AI platform, which in turn manages containers with language models and GPU units.

The main advantage of this architecture is the ability to scale AI components independently of the SAP system. Graphics cards can be added, models swapped, or inference infrastructure expanded without any interference with the ERP system’s core. This also simplifies licensing and maintenance matters.

An alternative approach is an embedded architecture, using SAP HANA Enterprise Edition. In this model, all AI components run within the SAP HANA application layer using a Cloud Foundry environment running locally. The built-in vector engine supports RAG scenarios - search augmented with context from a knowledge base.

For organisations planning implementation, security requirements are essential:

• Single sign-on (SSO) - integration with the SAP Identity Authentication Service eliminates the need to manage separate credentials for the AI layer

• Data masking - personally identifiable information is hidden before being sent to the language model

• Content filtering - protection against attempts to manipulate the model on input and output

• Full logging - all AI queries and responses are recorded for audit purposes

• TLS 1.3 encryption - securing all communication

• Network isolation - in a fully on-premise architecture, data never leaves the corporate network

“Security in AI-SAP integration is not an add-on feature - it is the foundation,” emphasises Jarosław Zdanowski. “We see too many organisations that, in the rush to deploy quickly, skip basic protection mechanisms. They are then surprised when an audit reveals non-compliance, or when the model starts answering questions it should not.”

In practice, SNOK applies a multi-layered approach to securing AI integration with SAP:

Layer 1: Access control The AI model never has greater privileges than the user asking the question. If an employee does not have access to financial data in SAP, the AI assistant will not be able to read or pass on that data either. This is achieved through integration with existing SAP authorisation mechanisms.

Layer 2: Sensitive data masking Before any data is sent to the language model, the system automatically identifies and hides national ID numbers, bank account numbers, email addresses and other personal data. The model receives masked data, and only in the response - if warranted by context - are the original values restored.

Layer 3: Content filtering Both user queries and model responses are analysed for attempts at manipulation, data leakage or the generation of inappropriate content. In a corporate SAP environment, it is particularly important to detect attempts to extract information about system structure, access credentials or sensitive business processes.

Layer 4: Full logging and auditability Every query and every response is recorded along with its context - who asked, when, from which system, and what data was made available to the model. This enables subsequent analysis, identification of irregularities, and compliance with regulatory requirements for AI system transparency.

Implementation recommendations vary significantly depending on the scale of the organisation and specific requirements.

For small and medium-sized enterprises: • A compact GPU unit (e.g. NVIDIA DGX Spark) • The Bielik-11B or Llama 3.1 8B model • Investment: 20,000-50,000 EUR (hardware + integration) • Predictable total cost of ownership - no API query fees

For large enterprises: • NVIDIA-certified servers with 4-8 H100 cards • NVIDIA AI Enterprise subscription: 18,000-36,000 EUR/year • SAP HANA Enterprise Edition with an AI layer • Hardware investment: 100,000-200,000 EUR + software

For the public sector (maximum sovereignty): • Bielik as the base model (Apache 2.0, Polish, legally sourced training data) • A fully on-premise architecture - zero cloud components • Compliance with KSC, GDPR and AI Act • Edge Integration Cell for SAP integration

“Every organisation is different, and there is no universal solution that works everywhere,” says Jacek Bugajski. “That is why at SNOK we start every project with an in-depth analysis of needs, existing infrastructure and regulatory requirements. Only then do we design the architecture and select the components. This approach requires more time at the outset, but it eliminates costly mistakes and redesigns at later stages.”

A typical AI deployment project in an on-premise SAP environment carried out by SNOK consists of several phases:

Phase 1: Analysis and design (2-4 weeks) • Inventory of existing SAP systems • Identification of use cases with the highest potential return on investment • Assessment of regulatory and security requirements • Design of the target architecture

Phase 2: Infrastructure preparation (2-6 weeks) • Ordering and installing GPU hardware (if required) • Configuring the container environment • Deploying the MCP server for SAP • Installing and optimising the chosen language model • Integration with authentication and authorisation systems

Phase 3: Pilot deployment (4-8 weeks) • Launching the AI assistant for a selected group of users • Gathering feedback • Fine-tuning the model and optimising system prompts • Verifying theoretical assumptions in practice

Phase 4: Production rollout and optimisation (depending on scale) • Gradually extending access to further groups of users • Monitoring performance and response quality • Training for end users • Documentation and knowledge transfer to the client’s IT team

“We do not leave clients on their own after implementation,” adds Michał Korzeń. “We offer maintenance packages covering model updates, performance optimisation, expansion to new use cases, and technical support. AI is not a one-off project - it is a continuous journey, and we want to be a partner on that path.”

Integrating AI with on-premise SAP is no longer an experiment or a vision of the future - it is a mature technological path with well-defined architectural patterns, available hardware and software, and proven implementation methodologies.

Three key breakthroughs in recent years have made this integration practical:

• Standardisation of MCP - eliminates the problem of combinatorial explosion of connectors and ensures long-term interface stability

• Maturity of the NVIDIA AI Enterprise platform - delivers a complete technology stack for running AI models locally, from inference libraries to security mechanisms

• Availability of local language models - with particular emphasis on the Polish Bielik model, enabling the construction of solutions fully independent of foreign cloud service providers

For Polish organisations - particularly those operating in regulated sectors or handling sensitive data - the ability to deploy AI without compromising data sovereignty is of fundamental importance. Bielik-11B offers quality comparable to 70-billion-parameter models in Polish-language tasks, with hardware requirements that allow it to run even on a CPU. Combined with the MCP server for SAP, dedicated NVIDIA hardware and proven architectural patterns, this creates an ecosystem that is not inferior to cloud solutions - and in many respects surpasses them.

“AI in on-premise SAP is not a compromise or a workaround - it is a deliberate architectural choice that, for many organisations, is the only sensible path,” concludes Michał Korzeń. “SNOK has the competencies, partnerships and experience to guide clients through the entire process - from needs analysis, through technology selection, architecture design, deployment, and on to maintenance and development. I invite any organisation looking to harness the potential of AI within its SAP environment, while retaining full control over its data, to get in touch.”

SNOK is a Polish IT consulting firm whose team combines over 25 years of cumulative experience in SAP systems, cybersecurity and intelligent automation. As an SAP Silver Partner, a UiPath Agentic Automation Fast Track Partner, and the official representative of SecurityBridge in Poland, SNOK combines deep technological expertise with practical implementation experience. Its ISO 9001 and ISO 27001 certifications confirm its commitment to quality and security in the projects it delivers.

If your organisation is considering deploying AI within an on-premise SAP environment, please get in touch. Our experts will help assess the possibilities, design the architecture, and carry out the implementation - from the initial analysis through to production go-live.

document.getElementById(“page”).classList.add(“newLayout”);

Would you like to see this in practice or discuss implementation at your organisation? Get in touch - we will respond within 48 hours.

Topics:Tech ThursdayAI automation SecurityBridge SAP S/4HANA SAP BTP SAP HANA SAP Joule

Found this useful? Please pass it on:

How to connect AI to on-premise SAP systems without losing control over your data

document.getElementById(“page”).classList.add(“newLayout”);

Tech Thursday

HITL gates in UiPath Maestro and AI Trust Layer: where agent autonomy ends and human decisions begin

When NOT to automate - 4 signs a process will kill your RPA project

From idea to an agent in production - what building an agentic solution really looks like

Get in touch