Designing a Community Health AI System for Rural Bangladesh: A Technical Architecture Study
A technical architecture and feasibility study for deploying AI-assisted health tools across 1,200 community health workers in six rural districts - with offline-first design, Bengali voice input, and automated district analytics.
Arko IT Services
Context: a design engagement with a health NGO
In 2024, Arko IT Services ran a technical architecture and feasibility study for a health NGO operating across six rural districts in Bangladesh. The organization had about 1,200 Community Health Workers (CHWs) doing maternal health, childhood illness, and nutrition monitoring across a combined population of roughly 1.8 million people.
The challenge was specific. They had run a paper-based data collection and decision support system for 15 years. The paper system worked. CHWs were trained on it, supervisors trusted it, and the data fed district health planning. But it was slow, it could not support real-time analytics, and it was generating data quality problems as the geographic footprint grew.
They wanted to know whether an AI-assisted digital system could replace or augment the paper without breaking CHW workflows that had taken years to establish.
The problem framing
Before designing anything, the team spent two weeks in the field: shadowing CHW home visits, sitting in on supervisor meetings, and reading through the paper forms. Four things stood out.
CHW literacy and device proficiency vary a lot. Senior CHWs with 10-plus years on the job were fluent with the paper system but had barely touched a smartphone. Newer CHWs were comfortable with phones but less sure of the clinical protocols. The system had to work for both.
Connectivity is genuinely unreliable, not just occasionally slow. In three of the six districts, connectivity dropped to zero for days at a time during monsoon season. Any digital system that needed connectivity for core functions would fail exactly when health risks peaked.
The paper system had safety nets that digital replacements tend to strip out. The CHW's paper register did several jobs at once: patient record, visit prompt, supervisor audit trail, and accountability mechanism. A digital replacement had to keep all of them.
The district health office needed aggregate analytics, not just digitized records. District health officers were spending 8 to 10 hours a week each on manual data aggregation. For the management layer, that was the most painful problem on the list.
The architecture design
Design principle 1: offline-first, always
Every core CHW workflow runs entirely on-device. The app carries a local SQLite database with the CHW's full household register, clinical protocol decision trees, and a 90-day history of visit records.
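A minimal sketch of the kind of local store this implies. The table and column names below are illustrative assumptions for the sketch, not the deployed schema:

```python
import sqlite3

# Illustrative local schema: household register, protocol decision
# trees (stored as JSON), and a rolling 90-day visit history.
SCHEMA = """
CREATE TABLE IF NOT EXISTS households (
    household_id TEXT PRIMARY KEY,
    head_name    TEXT NOT NULL,
    village      TEXT NOT NULL
);
CREATE TABLE IF NOT EXISTS visits (
    visit_id     TEXT PRIMARY KEY,
    household_id TEXT NOT NULL REFERENCES households(household_id),
    visit_type   TEXT NOT NULL,      -- ANC / Under-5 / Nutrition / Follow-up
    recorded_at  TEXT NOT NULL,      -- ISO-8601, device clock
    payload      TEXT NOT NULL,      -- structured fields as JSON
    synced       INTEGER DEFAULT 0   -- 0 = pending upload
);
CREATE TABLE IF NOT EXISTS protocols (
    protocol_id  TEXT PRIMARY KEY,
    version      INTEGER NOT NULL,
    tree_json    TEXT NOT NULL       -- decision tree pulled down on sync
);
"""

def open_local_db(path=":memory:"):
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn

def purge_old_visits(conn, cutoff_iso):
    # Keep only the 90-day window; anything older lives server-side.
    conn.execute(
        "DELETE FROM visits WHERE recorded_at < ? AND synced = 1",
        (cutoff_iso,))
    conn.commit()
```

Keeping the visit history to a rolling window is what makes the full register fit on a low-end device.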
Connectivity is used for four things only:
- Syncing new visit records to the central database
- Pulling down updated clinical protocols
- Escalating complex cases to a clinician reviewer
- Generating supervisor reports
If connectivity is gone for days, the CHW keeps working as normal. When it comes back, sync runs automatically with conflict resolution.
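One plausible conflict-resolution rule for that automatic re-sync is last-write-wins keyed on the visit timestamp. The record shape and merge policy below are assumptions for the sketch; the study does not specify the actual rule:

```python
def merge_records(local, remote):
    """Merge two versions of the same visit record after an offline gap.

    Last-write-wins on `recorded_at`; ties keep the remote (server)
    copy so every device converges on the same answer.
    """
    if local["recorded_at"] > remote["recorded_at"]:
        return local
    return remote

def sync_batch(pending, server_state):
    """Apply a batch of offline-written records to the server's state.

    pending: records written while offline.
    server_state: dict mapping visit_id -> current server record.
    """
    for rec in pending:
        current = server_state.get(rec["visit_id"])
        server_state[rec["visit_id"]] = (
            rec if current is None else merge_records(rec, current))
    return server_state
```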
Design principle 2: voice input for low-literacy users
The input interface is voice-first in Bengali, with touch as the fallback. A CHW speaks her observations into the device, and a Bengali ASR model transcribes them and maps them onto the relevant protocol fields.
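The step after transcription is slot-filling: pulling structured protocol fields out of free-form speech. A toy illustration on English text follows; the patterns and field names are assumptions for the sketch, and the real system maps Bengali transcripts onto its own protocol schema:

```python
import re

# Illustrative post-ASR slot-filling: the transcript is already text,
# and we extract numeric protocol fields from it.
FIELD_PATTERNS = {
    "temperature_c": re.compile(r"temperature (?:is )?([\d.]+)"),
    "muac_mm":       re.compile(r"muac (?:is )?([\d.]+)"),
}

def extract_fields(transcript: str) -> dict:
    """Map a transcribed utterance onto structured protocol fields."""
    fields = {}
    text = transcript.lower()
    for name, pattern in FIELD_PATTERNS.items():
        match = pattern.search(text)
        if match:
            fields[name] = float(match.group(1))
    return fields
```

Fields the CHW did not mention simply stay empty, so the voice-guided flow can prompt for them one at a time.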
For feature phone users (about 15% of the CHW population in this deployment), a USSD interface handles basic data submission and emergency escalation with no smartphone needed.
CHW Visit Workflow:
1. Open household record
[Touch or voice: "Open Fatema's record"]
|
v
2. Select visit type
[ANC / Under-5 / Nutrition / Follow-up]
|
v
3. Voice-guided data entry
"What is the child's temperature?"
CHW speaks answer -> ASR -> structured field
|
v
4. On-device protocol check
[Lightweight decision tree - offline]
|
v
5. Risk classification
[Low / Moderate / High / Emergency]
|
High/Emergency: alert + escalation prompt
Low/Moderate: standard recommendation
|
v
6. Record saved locally
[Sync when connectivity available]
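The on-device protocol check in step 4 can be sketched as a plain decision tree. The thresholds below are illustrative only, not the calibrated IMCI/ANC values:

```python
def classify_under5_fever(fields: dict) -> str:
    """Toy triage tree for an under-5 fever check.

    Thresholds are illustrative; the deployed tree is calibrated
    against IMCI and Bangladesh ANC protocols.
    """
    danger_sign = fields.get("danger_sign", False)  # e.g. convulsions
    temp = fields.get("temperature_c")
    if danger_sign:
        return "Emergency"
    if temp is None:
        return "Moderate"      # missing vitals: err on the side of caution
    if temp >= 39.5:
        return "High"
    if temp >= 38.0:
        return "Moderate"
    return "Low"
```

Because it is a plain tree, the explanation the CHW can give a family is simply the path the answers took.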
Design principle 3: two-tier AI architecture
Tier 1 is a lightweight on-device model. It handles routine triage with a compact decision tree that runs on a 2GB RAM Android device with no connectivity, calibrated against IMCI and Bangladesh ANC protocols. It is explainable by design.
Tier 2 is cloud-based clinical reasoning. It handles complex cases escalated from Tier 1, using a larger language model fine-tuned on clinical knowledge, and it is only reachable when there is connectivity.
The design deliberately keeps Tier 2 out of routine decisions, for two reasons: cost, and CHW trust.
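The escalation boundary between the tiers can be sketched as a small routing function. The return labels and the offline fallback below are assumptions for the sketch:

```python
def route_case(risk: str, online: bool) -> str:
    """Decide where a case is handled.

    Tier 1 (on-device) owns all routine decisions; Tier 2 (cloud) is
    reached only for escalated cases, and only when connectivity exists.
    """
    if risk in ("Low", "Moderate"):
        return "tier1_recommendation"       # never leaves the device
    if online:
        return "tier2_clinician_review"     # complex case, cloud model
    return "tier1_fallback_plus_referral"   # offline: safe default + referral
```

Routing on risk class first, and connectivity second, is what keeps Tier 2 out of the routine path regardless of network conditions.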
Design principle 4: district analytics pipeline
graph LR
    subgraph CHW_LAYER["CHW Devices - 1,200 workers"]
        D1[Device 1]
        D2[Device 2]
        D3[Device N]
    end
    subgraph SYNC["Sync Layer"]
        S[Regional Sync Server]
        D1 -->|Encrypted batch sync| S
        D2 -->|Encrypted batch sync| S
        D3 -->|Encrypted batch sync| S
    end
    subgraph ANALYTICS["District Analytics"]
        S --> ANON[Anonymization Layer - PII removed]
        ANON --> DW[District Data Warehouse]
        DW --> DASH[District Health Dashboard]
        DW --> OUTBREAK[Outbreak Detection Engine]
        DW --> REPORT[Automated Weekly Reports]
        OUTBREAK --> ALERT[District Health Officer Alert]
    end
    subgraph QUALITY["Quality Assurance"]
        DW --> SUPER[Supervisor Visit Audit]
        SUPER --> CHW_PERF[CHW Performance Reports]
        CHW_PERF --> TRAINING[Training Need Identification]
    end
The analytics pipeline is designed to eliminate the 8 to 10 hours a week each district health officer was spending on manual data aggregation.
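One simple form the outbreak detection engine could take is a rolling-baseline threshold on daily symptom counts. The rule below (flag a day above mean plus k standard deviations of the trailing window) and its parameter values are an illustration, not the deployed detector:

```python
from statistics import mean, stdev

def outbreak_alerts(daily_counts, k=2.0, window=14):
    """Flag days whose case count exceeds mean + k*stdev of the
    preceding `window` days. Returns the indices of flagged days."""
    alerts = []
    for i in range(window, len(daily_counts)):
        baseline = daily_counts[i - window:i]
        threshold = mean(baseline) + k * stdev(baseline)
        if daily_counts[i] > threshold:
            alerts.append(i)
    return alerts
```

Even a rule this simple, fed by daily syncs instead of monthly paper aggregation, is what moves detection latency from weeks to days.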
Tradeoffs navigated
Accuracy versus explainability. A black-box deep learning model might beat a decision tree on clinical triage accuracy. But a CHW cannot explain a black box, and a tool she cannot explain is a tool she will not trust or use. We picked explainable models for Tier 1 and accepted a small accuracy hit to get something that would actually be deployed.
Feature richness versus simplicity. The digital system could collect far more data than the paper one. We recommended against it. Piling on data collection without adding CHW value is the fastest route to non-compliance. The system collects the same core fields plus three high-value additions, and nothing else.
Central cloud versus regional deployment. Data is processed in a South Asia Azure region rather than routed to European or North American data centers. That cuts latency, data sovereignty risk, and cost in one move.
Projected outcomes
From the feasibility study projections, to be validated in the pilot:
- CHW visit documentation time drops from 12 to 15 minutes per household to 6 to 8 minutes
- ANC coverage visibility moves from monthly aggregate reports to real-time tracking by household
- Outbreak detection latency drops from 2 to 3 weeks to 48 to 72 hours
- District health officer reporting goes from 8 to 10 hours a week to automated weekly reports
- Supervisor audit moves from spot-check sampling to a complete visit trail
The pilot covers three of the six districts, with a 6-month evaluation period before any call on a full rollout.