Exception-based attendance: stop storing a row for every student who showed up

When you build any kind of attendance or check-in feature, the obvious data model is one row per person per session: (session_id, student_id, status). Mark everyone, store everyone. It works on day one and quietly becomes a problem by month three.

For a class of 30 students meeting 20 times a month, that's 600 rows a month — and ~95% of them say the same thing: present. You're paying storage, write cost, and UI friction to record the thing that almost always happens.

Model the exception, not the norm

The fix is to flip the default. Present is assumed. You only store the deviations:

Attendance exception: (session_id, student_id, status)
  where status ∈ { absent, late, excused }

Anyone on the batch's roster for that session who has no exception row is present and on time. (A late student did attend, but lateness is still a deviation worth recording.) To reconstruct attendance for a session, compute:

present(session) = roster(session.batch, session.date) − exceptions(session)
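
As a rough illustration of that set difference, here is a minimal Python sketch. The names (AttendanceException, attendance_for_session) and the in-memory lists are assumptions made for the example, not the app's actual code; a real implementation would read the roster and the exception rows from its database.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AttendanceException:
    # Hypothetical row shape mirroring (session_id, student_id, status).
    session_id: str
    student_id: str
    status: str  # "absent", "late", or "excused"


def attendance_for_session(session_id, roster, exceptions):
    """Return {student_id: status} for every student on the roster.

    roster: student ids enrolled on this session's date.
    exceptions: AttendanceException rows, possibly from many sessions.
    """
    by_student = {
        e.student_id: e.status
        for e in exceptions
        if e.session_id == session_id
    }
    # No exception row for a rostered student means "present".
    return {sid: by_student.get(sid, "present") for sid in roster}


roster = ["asha", "bilal", "chitra", "dev"]
exceptions = [
    AttendanceException("s1", "bilal", "absent"),
    AttendanceException("s1", "dev", "late"),
    AttendanceException("s2", "asha", "excused"),  # different session, ignored for s1
]

result = attendance_for_session("s1", roster, exceptions)
print(result)
# {'asha': 'present', 'bilal': 'absent', 'chitra': 'present', 'dev': 'late'}
```

Only two rows were stored for session s1, yet the full four-student sheet comes back.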

The wins are immediate:

Storage collapses. Two kids skipped class? Two rows. Not thirty.
Marking is fast. The tutor taps the 2 absentees instead of confirming 30 names. In practice this is the difference between a 10-second task and a tedious one nobody keeps up with.
The UI gets simpler. Default state is "all present," and you only surface what changed.
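
To put a number on the storage claim, here is the arithmetic using the figures from earlier in this post (30 students, 20 sessions a month, roughly 95% present). The 95% is the same rough estimate as above, not a measurement.

```python
students = 30
sessions_per_month = 20
present_pct = 95  # rough estimate from this post, not measured data

# One row per student per session.
rows_per_person_model = students * sessions_per_month

# Only the non-present rows are stored in the exception model.
exception_rows = rows_per_person_model * (100 - present_pct) // 100

print(rows_per_person_model)  # 600
print(exception_rows)         # 30
```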

The part people get wrong

The catch is roster(batch, date). You can't just use "current members," because membership changes over time — students join mid-term, pause for a vacation, or leave. If you resolve the roster from today's membership, your historical attendance silently rewrites itself.

So membership needs to be temporal, not a boolean. Store the lifecycle:

membership: (student_id, batch_id, joined_on, events[])
  events: [{ type: paused|resumed|left, on: date }]

Now roster(batch, date) means "students whose membership was active on that date" — and a student paused for February correctly drops out of February's sessions without deleting anything. Attendance from six months ago still reconstructs exactly as it happened.
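
Here is a minimal Python sketch of a date-aware roster lookup over that shape. It assumes each event takes effect on its own date (a pause dated 1 Feb means the student is already off the roster on 1 Feb) and that events alternate sensibly; the Membership class and the function names are hypothetical, chosen for the example.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Membership:
    student_id: str
    batch_id: str
    joined_on: date
    # Each event: {"type": "paused" | "resumed" | "left", "on": date}
    events: list = field(default_factory=list)


def is_active(m, on):
    """True if the membership was active on the given date."""
    if on < m.joined_on:
        return False
    active = True
    # Replay events in date order up to and including `on`.
    for event in sorted(m.events, key=lambda e: e["on"]):
        if event["on"] > on:
            break
        active = event["type"] == "resumed"
    return active


def roster(memberships, batch_id, on):
    """Student ids whose membership in the batch was active on that date."""
    return sorted(
        m.student_id
        for m in memberships
        if m.batch_id == batch_id and is_active(m, on)
    )


memberships = [
    Membership("asha", "b1", date(2024, 1, 5)),
    Membership("bilal", "b1", date(2024, 1, 5), [
        {"type": "paused", "on": date(2024, 2, 1)},
        {"type": "resumed", "on": date(2024, 3, 1)},
    ]),
    Membership("chitra", "b1", date(2024, 2, 10)),  # joined mid-term
    Membership("dev", "b1", date(2024, 1, 5), [
        {"type": "left", "on": date(2024, 2, 20)},
    ]),
]

print(roster(memberships, "b1", date(2024, 1, 15)))  # ['asha', 'bilal', 'dev']
print(roster(memberships, "b1", date(2024, 2, 15)))  # ['asha', 'chitra', 'dev']
print(roster(memberships, "b1", date(2024, 3, 5)))   # ['asha', 'bilal', 'chitra']
```

Nothing is deleted when a student pauses or leaves, so asking for January's roster in June returns the same answer it did in January.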

Why this matters even more offline-first

I ran into this building Mentor Batch, a coaching-center manager for Indian tutors (Flutter + Firestore, offline-first so it works in areas with flaky connectivity). When your writes have to sync over a spotty connection, skipping 28 redundant "present" records per session (30 students, 2 absent) isn't a nice-to-have: fewer writes means less to sync, fewer conflicts, and lower Firestore cost. Modeling the exception paid off twice.

The same idea shows up in billing

Once you start looking for "store the deviation, derive the norm," you see it elsewhere. Mentor Batch's fee ledger uses a cousin of this idea for payments: instead of asking the tutor to mark each month paid, you record a single lump-sum payment and auto-allocate it oldest-charge-first across outstanding dues, generating the per-month breakdown deterministically. The source of truth is the charges and the payments; the "what's paid" view is derived. Same principle — keep the inputs minimal and canonical, compute the rest.
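
For illustration only, oldest-charge-first allocation might look like the Python sketch below. This is not Mentor Batch's actual ledger code: the function name, the charge shape, the "YYYY-MM" month keys, and the use of whole-number amounts are all assumptions made for the example.

```python
def allocate_oldest_first(charges, payment_amount):
    """Spread one payment across outstanding charges, oldest month first.

    charges: list of {"month": "YYYY-MM", "due": int} (hypothetical shape;
             amounts are whole numbers in the smallest currency unit).
    Returns (allocations, leftover), where allocations is a list of
    (month, amount_applied) and leftover is any unallocated remainder.
    """
    allocations = []
    remaining = payment_amount
    # "YYYY-MM" strings sort chronologically, so a plain sort is enough.
    for charge in sorted(charges, key=lambda c: c["month"]):
        if remaining <= 0:
            break
        applied = min(charge["due"], remaining)
        if applied > 0:
            allocations.append((charge["month"], applied))
            remaining -= applied
    return allocations, remaining


charges = [
    {"month": "2024-03", "due": 1000},
    {"month": "2024-01", "due": 1000},
    {"month": "2024-02", "due": 1000},
]

allocations, leftover = allocate_oldest_first(charges, 2500)
print(allocations)  # [('2024-01', 1000), ('2024-02', 1000), ('2024-03', 500)]
print(leftover)     # 0
```

Because the function is deterministic, the per-month breakdown never needs to be stored; it can be recomputed from the charges and payments whenever the view is rendered.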

Takeaway

Before you add a status column for every entity in every period, ask which value is the default. If one value dominates, don't store it — store the exceptions and derive the rest from a canonical, time-aware source. Your tables stay small, your writes stay cheap, and your history stays honest.

If you run a coaching center or tutor on the side, Mentor Batch is free for up to 3 batches and 15 students (Android or web) — but the attendance pattern above is yours to steal regardless of what you build with.