Cloud AI vendors offer strong general-purpose models, and every one of them processes queries on its own infrastructure.
That means patient data travels to a third-party server every time a clinician uses the tool. Your compliance team has read those data processing agreements. On-prem deployment isn't a preference. It's what keeps PHI where it belongs, under your control, on your network.
4MINDS runs inside your hospital network. PHI trains your clinical model. Nothing crosses the perimeter.
How healthcare teams use 4MINDS
Physicians spend roughly two hours per day on documentation for every eight hours of patient care. 4MINDS reads structured and unstructured clinical notes, lab results, and prior visit records and produces a concise patient summary at the start of each appointment. PHI stays on the hospital's own Kubernetes cluster. The physician walks into the room already oriented.
Graph RAG builds a knowledge graph of drug interactions, contraindications, and formulary rules. When a clinician asks whether a patient can safely take a new medication given their current medications and renal function, the model traverses the graph, surfaces the relevant interactions, and returns a cited answer with the source clinical guidelines. Every answer traces to the corpus.
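The traversal idea can be sketched in a few lines. Everything below is illustrative — the graph schema, drug names, relation types, and the eGFR threshold are placeholders, not the 4MINDS data model:

```python
# Toy knowledge graph: node -> [(relation, target, citation)].
# Drug names, relations, and the eGFR cutoff are illustrative only.
GRAPH = {
    "warfarin":  [("interacts_with", "ibuprofen", "guideline:anticoag-2023")],
    "metformin": [("contraindicated_if", "eGFR<30", "label:metformin")],
}

def check_medication(new_drug, current_meds, egfr):
    """Traverse the graph; return (finding, citation) pairs for a proposed drug."""
    findings = []
    # Edges from the proposed drug itself (e.g. renal contraindications).
    for relation, target, citation in GRAPH.get(new_drug, []):
        if relation == "contraindicated_if" and target == "eGFR<30" and egfr < 30:
            findings.append((f"{new_drug} contraindicated: {target}", citation))
    # Edges from each current medication that point at the proposed drug.
    for med in current_meds:
        for relation, target, citation in GRAPH.get(med, []):
            if relation == "interacts_with" and target == new_drug:
                findings.append((f"{med} interacts with {new_drug}", citation))
    return findings

# Proposing ibuprofen for a patient on warfarin surfaces the interaction,
# with the guideline it came from.
print(check_medication("ibuprofen", ["warfarin", "metformin"], egfr=55))
```

The point of the structure is the citation riding along with every edge: an answer is assembled from graph hops, and each hop carries its source document.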
Insurance prior authorization requires pulling the right clinical criteria, matching patient records, and writing a letter that meets payer requirements. 4MINDS automates the draft: the model reads the payer criteria, the patient record, and the treating physician notes, then produces a prior authorization letter ready for physician review. What takes 45 minutes takes 8 minutes.
The AMA estimates prior authorization consumes an average of 14 hours of physician and staff time per week per practice. 4MINDS reduces per-request time by approximately 80%, bringing weekly volume back under three hours.
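The weekly figure follows from simple arithmetic, assuming the per-request reduction applies uniformly across the week's prior-authorization workload:

```python
# Per-request times from the text; the AMA weekly estimate per practice.
minutes_before, minutes_after = 45, 8
per_request_reduction = 1 - minutes_after / minutes_before
weekly_hours_before = 14
weekly_hours_after = weekly_hours_before * minutes_after / minutes_before

print(f"per-request reduction: {per_request_reduction:.0%}")  # ~82%
print(f"weekly load: {weekly_hours_after:.1f} hours")         # ~2.5 hours
```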
What your compliance team will ask — and what the architecture answers
If your policy is that PHI never leaves the network, 4MINDS is designed for exactly that constraint. PHI stays on your Kubernetes cluster. There is no API call to an external model and no data processing agreement required with a cloud AI provider, because the model runs inside your perimeter. The architecture review your compliance team needs to see is a deployment diagram, not a vendor DPA.
Every Ghost Weights model update must pass an eval gate before going live. You configure the evaluation criteria, including accuracy on clinical benchmark tasks specific to your use cases. If the eval fails, the current production model keeps running. The model does not update without passing your quality gate.
Inference, fine-tuning, and knowledge retrieval run entirely inside your hospital infrastructure. No patient data, no clinical notes, no PHI reaches an external API. Your compliance team controls the infrastructure. The vendor doesn't have a copy.
The model continuously trains on your clinical notes, treatment protocols, and formulary updates, inside your perimeter. When protocols change, the model learns from the new version automatically. No scheduled retraining project. No sending data off-site for a fine-tuning run.
Before any updated model version reaches clinicians, it passes a quality evaluation against your configured benchmarks. Every update is timestamped. Your team has a documented record of what model was running, when, and what it passed, before it went live.
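In outline, the gate logic reduces to something like the sketch below. The threshold, benchmark runner, and record fields are illustrative stand-ins, not the actual 4MINDS interface:

```python
import time

PASS_THRESHOLD = 0.90  # illustrative; the hospital configures the real criteria

def run_benchmark(model):
    """Stand-in for scoring a candidate on the configured clinical benchmarks."""
    return model["benchmark_accuracy"]

def gate(production, candidate, audit_log):
    """Promote the candidate only if it clears the gate; record the result either way."""
    score = run_benchmark(candidate)
    passed = score >= PASS_THRESHOLD
    audit_log.append({
        "version": candidate["version"],
        "score": score,
        "passed": passed,
        "timestamp": time.time(),  # timestamped record for compliance review
    })
    # On failure, the current production model keeps serving unchanged.
    return candidate if passed else production

log = []
prod = {"version": "v41", "benchmark_accuracy": 0.91}
prod = gate(prod, {"version": "v42", "benchmark_accuracy": 0.88}, log)
print(prod["version"])  # v41 -- the failing candidate never went live
```

Note that the audit record is written whether or not the candidate passes, which is what gives the compliance team a complete history rather than a history of successes.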
Drug interaction databases, clinical guidelines, structured patient records, and institutional protocols are organized as a knowledge graph. A query connecting a diagnosis to a protocol to a contraindication to a patient's medication history traverses the graph. Flat vector search gives you the most similar document. Graph RAG gives you the connected answer.
Key differentiators for healthcare
- PHI residency by architecture: The model trains on your patient data inside your network. PHI never moves to a vendor's infrastructure because inference and training run on your Kubernetes cluster.
- Clinical currency without retraining sprints: Ghost Weights updates the model as treatment protocols and formulary changes arrive, automatically, without a scheduled fine-tuning project.
- Eval gate before clinical deployment: No updated model reaches a clinician workflow without passing a quality benchmark you configure. Every version is timestamped and documented.
See how 4MINDS handles healthcare requirements.
30-minute technical walkthrough. On-prem deployment. No pitch deck.
Health systems using 4MINDS avoid the PHI egress question because the architecture answers it before it comes up. There's no data processing agreement that allows PHI to leave, because PHI doesn't leave.
Three clinical use cases — and what PHI data residency actually requires
Health systems adopting AI are converging on three use cases that deliver the most measurable impact:
Clinical documentation assistance. Discharge summaries, prior authorization letters, clinical notes, referral correspondence. Useful AI assistance at scale requires the model to understand your patient population, your clinical protocols, your formulary, and your documentation conventions — not generic medical writing. That understanding comes from training on your clinical notes and patient records. Training a model on PHI using a commercial AI vendor's infrastructure is a PHI transmission under HIPAA. On-prem deployment means the training loop runs inside your Kubernetes cluster. PHI never crosses your network boundary.
Drug interaction and formulary advisory. Commercial LLMs have knowledge cutoffs. They cannot reflect your current formulary, your current clinical protocols, or your institution's documented patient warnings. When your pharmacy updates the formulary, the model needs to reflect that change — automatically, without a retraining project, without sending PHI off-site. Ghost Weights trains continuously on formulary updates and protocol changes inside your network. When the formulary changes, the shadow model trains on the new version. The eval gate validates quality. The swap is atomic. No downtime.
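The shadow-train / eval / atomic-swap lifecycle can be sketched as follows. Training and evaluation are stubbed out; only the swap mechanics are the point, and the class and field names are hypothetical, not 4MINDS internals:

```python
import threading

class ShadowSwapServer:
    """Sketch of the shadow-model pattern: train a copy, gate it, swap atomically."""

    def __init__(self, model):
        self._live = model
        self._lock = threading.Lock()

    def answer(self, query):
        model = self._live  # single reference read; never sees a half-swapped model
        return f"[v{model['version']}] {query}"

    def on_corpus_update(self, new_corpus, eval_gate):
        # 1. Train a shadow copy on the updated corpus (stubbed here).
        shadow = {"version": self._live["version"] + 1, "corpus": new_corpus}
        # 2. Run the eval gate; a failing shadow is discarded, production untouched.
        if not eval_gate(shadow):
            return False
        # 3. Atomic swap: production traffic moves in a single assignment.
        with self._lock:
            self._live = shadow
        return True

server = ShadowSwapServer({"version": 1, "corpus": "formulary-2024Q1"})
server.on_corpus_update("formulary-2024Q2", eval_gate=lambda m: True)
print(server.answer("is drug X on formulary?"))  # now served by version 2
```

Because the live model is replaced in one assignment, in-flight queries finish against the old version and new queries hit the new one — that is what "atomic, no downtime" means operationally.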
Operational forecasting. Patient census, OR scheduling optimization, ER throughput, supply chain demand. LLM-native time series for these workloads runs inside 4MINDS on your Kubernetes cluster — no separate Python stack, no additional ML vendor. The data feeding these models — historical census, discharge patterns, utilization records — is PHI-adjacent or PHI-containing. It processes inside your network.
What PHI residency actually requires
HIPAA's minimum necessary standard applies to every disclosure of PHI. Routing a patient's clinical context to a commercial AI vendor for inference is a disclosure. A Business Associate Agreement creates contractual liability — it does not prevent the disclosure or make it compliant. OCR has been consistent: a BAA shifts accountability; it does not authorize unnecessary PHI transmissions.
For clinical use cases where the AI needs PHI to be useful — which is most of them — on-prem deployment is the only architecture that keeps PHI inside your network while allowing the model to learn from it.
Joint Commission standards for clinical decision support tools require documentation that AI systems used in patient care meet quality standards before deployment. The eval gate in 4MINDS produces timestamped quality records for every model update: what changed, when, what benchmark it passed before going live. Your compliance team gets that documentation automatically.
How 4MINDS handles this architecturally
Ghost Weights trains on your clinical notes, treatment protocols, and formulary updates inside your Kubernetes cluster. The shadow model trains on updated data. The eval gate runs. The atomic swap puts the updated model into production. The version record is retained. Your IT and compliance teams can answer "what model was running during this patient interaction?" at any time without a manual audit process.
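Answering "what model was running at this moment?" is a lookup against the retained version log. A minimal sketch, with illustrative timestamps, version names, and benchmark identifiers:

```python
from bisect import bisect_right
from datetime import datetime

# Illustrative retained log: (go-live time, version, benchmark it passed).
VERSION_LOG = [
    (datetime(2024, 3, 1, 9, 0),   "v41", "clin-bench-2024.02"),
    (datetime(2024, 4, 12, 2, 30), "v42", "clin-bench-2024.04"),
]

def model_at(timestamp):
    """Return the (version, benchmark) live at a given timestamp, or None."""
    go_live_times = [t for t, _, _ in VERSION_LOG]
    i = bisect_right(go_live_times, timestamp) - 1  # latest go-live <= timestamp
    return None if i < 0 else VERSION_LOG[i][1:]

print(model_at(datetime(2024, 3, 15)))  # ('v41', 'clin-bench-2024.02')
```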
Graph RAG organizes drug interaction databases, formulary structures, clinical guidelines, and patient record schemas as a knowledge graph. A clinical query connecting current medications to a potential interaction to the current formulary status traverses the graph at query time. Flat vector search returns similar text. Graph RAG returns the connected clinical picture.
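The difference between similarity and connection is a path. A hedged sketch — node names and edges below are hypothetical, not a 4MINDS schema:

```python
from collections import deque

# Illustrative edges linking a diagnosis through a protocol to a medication.
EDGES = {
    "dx:ckd_stage4":         ["protocol:renal-dosing"],
    "protocol:renal-dosing": ["contra:nsaid-in-ckd"],
    "contra:nsaid-in-ckd":   ["med:ibuprofen"],
}

def connect(start, goal):
    """Breadth-first search: the multi-hop path a flat similarity lookup can't see."""
    queue, seen = deque([[start]]), {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in EDGES.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None

print(connect("dx:ckd_stage4", "med:ibuprofen"))
```

A vector store would rank the diagnosis note and the NSAID warning by text similarity and might return neither; the graph walk returns the chain that links them.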
Air-gapped deployment is supported for health systems that require zero external connectivity — no outbound calls to 4MINDS or any external service after initial setup. Inference and model updates run entirely inside your hospital network.
"Our cloud AI vendor has a Business Associate Agreement. Is that sufficient?"
A BAA is a liability document. It determines who is accountable in a breach. It does not prevent PHI transmission. OCR's position is clear: a BAA does not authorize unnecessary PHI disclosures; it governs the consequences when they occur. On-prem deployment makes the BAA question irrelevant. PHI never moves, so there is no disclosure to evaluate and no transmission to govern.
Ready to see this in your environment?
30-minute technical walkthrough. On-prem deployment. No pitch deck.
We'll walk through the data architecture with your compliance lead before any demo.