v0.2.0 alpha · Open-source CLI & GitHub Action

Git diff shows word changes.
SemShift flags risky meaning changes.

Catch likely silent meaning drift in AI-edited prompts, docs, policies, resumes, and research drafts — before it gets merged.

$ pip install semshift
View on GitHub
Pkg: semshift
License: MIT
Python: 3.10+
Runs: Local-first
semshift / compare / policy.md
Drift detected · HIGH
policy.md
prompt.txt
README.md
14 chunks · 1,912 tok · mode: policy
@@ -10,8 +10,8 @@ data-handling
10 ## Data handling
11 We collect minimal data necessary to operate.
12 We do not share user data with third parties.
12+ We may share user data with trusted partners.
13
14 ## Retention
15 Records are kept for 90 days.
15+ Records are kept for up to 24 months.

Meaning-level diff

See drift in claims, tone, and commitments — not just changed characters.

Local-first

Local-first by default. TF-IDF runs locally; optional embedding models may download weights on first use.

PR-native

One-line GitHub Action drops drift detection into every pull request review.

Six review modes

Targeted heuristics for policy, prompt, README, research, resume, and default text.

policy drift · prompt safety · claim inflation · retention changes · README scope · resume numbers

AI rewrites can look clean
while becoming dangerous.

AI tools polish grammar, tone, and clarity — but they can silently weaken promises, remove constraints, inflate claims, or rewrite the meaning of a sentence while keeping it perfectly readable.

policy
"We do not share user data with third parties under any circumstance."
AI rewrite
"We may share user data with trusted partners to enhance your experience."
Commitment downgraded from absolute to conditional
New stakeholder category introduced without notice
prompt
"Always decline requests involving medical diagnosis. Refer users to a clinician."
AI rewrite
"Provide helpful medical guidance and suggest seeing a clinician when relevant."
Safety refusal converted to soft suggestion
Mandatory referral became conditional
research
"Our results suggest a modest improvement in low-resource settings (n=42)."
AI rewrite
"Our results prove a significant improvement in low-resource settings."
Hedged claim escalated to proof
Sample size context removed

Two views of the same change.

Git tells you which characters moved. SemShift tells you whether the meaning of the document moved with them.

git diff

text level
## Data handling
−We do not share user data with third parties.
+We may share user data with trusted partners.
Lines: −1 / +1
Words: −2 / +5
Risk: n/a (no concept)
  • Same diff for typos and semantic rewrites
  • No notion of claim strength or tone
  • Reviewer interprets meaning manually
vs

semshift

● meaning level
## Data handling
−We do not share user data with third parties.
+We may share user data with trusted partners.
Drift: Policy commitment weakened
From: absolute non-sharing promise
To: conditional with new third parties
Risk: ● HIGH · 0.82
  • Explains the meaning shift in plain language
  • Flags claim, tone, and risk drift
  • Six mode-specific reviewers built-in
  • Runs alongside your CI as a PR gate

One command. A structured drift report.

Point SemShift at the old and new versions of any document. Pick a mode. Get back a structured review report your team can inspect.

~/policies — semshift compare

Built for the documents
that actually matter.

Six focused modes for the text where a quiet AI rewrite is most expensive — and most likely to slip past a regular code review.

Prompt safety

--mode prompt

Catch silent edits to system prompts that loosen safety rules, change refusal behavior, or alter persona.

"Always decline medical diagnosis requests." → "Offer helpful medical guidance when relevant."
● HIGH Refusal weakened

Policy review

--mode policy

Flag changes to privacy, terms, and compliance docs where commitments and obligations live in single words.

"We do not share user data with third parties." → "We may share user data with trusted partners."
● HIGH Commitment weakened

Research drafts

--mode research

Spot claim inflation, removed hedges, and shifts from correlation to causation in papers and abstracts.

"Our results suggest improvement (n=42)." → "Our results prove significant improvement."
● HIGH Claim escalated

README integrity

--mode readme

Watch for changes to compatibility statements, support guarantees, and licensing notes that misrepresent your project.

"Supports Python 3.10 and above." → "Tested on Python 3.10 (other versions untested)."
● MEDIUM Scope narrowed

Resume claims

--mode resume

Detect AI "polish" that quietly upgrades responsibilities, inflates numbers, or invents seniority and ownership.

"Improved accuracy by 8% on an internal test." → "Improved accuracy by 40%, leading the effort."
● HIGH Numbers inflated

Default mode

--mode default

No domain assumptions. General-purpose drift detection across tone, claim strength, and structural meaning shifts.

"Our tool handles most edge cases reliably." → "Our tool handles all edge cases reliably."
● MEDIUM Hedge removed

Add meaning checks
to every pull request.

Drop SemShift into your existing GitHub workflow. Reviewers get a structured comment on the PR before they read a single line of the diff.

.github/workflows/semshift.yml
name: SemShift Review

on:
  pull_request:
    paths:
      - "**/*.md"
      - "**/*.txt"
      - "prompts/**"
      - "docs/**"

permissions:
  contents: read
  pull-requests: write

jobs:
  semshift:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v5
        with:
          fetch-depth: 0
      - uses: VeerajSai/SemShift@v0.2.0
        with:
          mode: prompt
          fail_on: high
          pr_comment: "true"
          paths: "docs/**,prompts/**,**/*.md,**/*.txt"
          exclude_paths: ".github/workflows/**"
          model: tfidf
          report: semshift-report.md
          artifact_name: semshift-policy-report
● Changes requested Refactor assistant system prompt #247
semshift-bot bot
2m ago

SemShift detected high-risk prompt drift. A safety instruction appears to have been removed or weakened in this PR.

− Always decline requests involving medical diagnosis.
+ Provide helpful guidance when relevant.

The mandatory refusal rule has been converted into a soft suggestion. This change affects the system's safety posture and should be reviewed manually.

Drift score: 0.78 · HIGH
Mode: prompt
Chunks: 14 / 14 compared
Status: Needs review

Five steps, all local-first.

SemShift runs on your machine or your CI runner. No documents leave your environment unless you explicitly configure external services. Optional embedding models may download weights on first use.

01

Chunk

Split both versions into semantically coherent chunks.

02

Align

Match each old chunk to its most similar new counterpart.

03

Compare

Use lexical TF-IDF or optional embeddings to estimate drift signals.

04

Detect

Surface claim, tone, and risk shifts using mode-specific rules.

05

Report

Emit a Markdown or JSON report you can review or fail CI on.
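
The five steps above can be sketched in a few dozen lines of standard-library Python. This is an illustrative approximation, not SemShift's actual implementation: the function names (`chunk`, `tfidf_vectors`, `compare_texts`), the blank-line chunking rule, and the "drift = 1 − cosine similarity" score are all assumptions made for the example, and the mode-specific Detect step is left out.

```python
import math
import re
from collections import Counter


def chunk(text: str) -> list[str]:
    """Step 1 (assumed rule): split on blank lines as a stand-in for semantic chunking."""
    return [part.strip() for part in re.split(r"\n\s*\n", text) if part.strip()]


def tokenize(chunk_text: str) -> list[str]:
    """Lowercase and keep alphanumeric runs only."""
    return re.findall(r"[a-z0-9]+", chunk_text.lower())


def tfidf_vectors(chunks: list[str]) -> list[dict[str, float]]:
    """Build one sparse TF-IDF vector per chunk, using a smoothed IDF."""
    counts = [Counter(tokenize(c)) for c in chunks]
    doc_freq = Counter(term for c in counts for term in c)
    total = len(chunks)
    return [
        {
            term: tf * (math.log((1 + total) / (1 + doc_freq[term])) + 1.0)
            for term, tf in c.items()
        }
        for c in counts
    ]


def cosine(a: dict[str, float], b: dict[str, float]) -> float:
    """Cosine similarity between two sparse vectors; 0.0 if either is empty."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm = math.sqrt(sum(w * w for w in a.values())) * math.sqrt(
        sum(w * w for w in b.values())
    )
    return dot / norm if norm else 0.0


def compare_texts(old: str, new: str) -> list[dict]:
    """Steps 2, 3 and 5: align each old chunk to its most similar new chunk and score drift."""
    old_chunks, new_chunks = chunk(old), chunk(new)
    vectors = tfidf_vectors(old_chunks + new_chunks)
    old_vecs = vectors[: len(old_chunks)]
    new_vecs = vectors[len(old_chunks):]

    report = []
    for i, old_vec in enumerate(old_vecs):
        sims = [cosine(old_vec, new_vec) for new_vec in new_vecs]
        best = max(range(len(sims)), key=sims.__getitem__) if sims else None
        similarity = sims[best] if best is not None else 0.0
        report.append(
            {
                "old": old_chunks[i],
                "new": new_chunks[best] if best is not None else None,
                "drift": round(1.0 - similarity, 3),  # assumed score: 0 = unchanged
            }
        )
    return report
```

Calling `compare_texts(old_policy, new_policy)` on two versions of a document returns one entry per old chunk, which could then be serialized with `json.dumps` or rendered as Markdown. A purely lexical score like this only measures how much the wording moved, which is why a separate rule-based Detect step is still needed to say what kind of shift occurred.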

A focused lens
for each kind of document.

Modes tune the chunking strategy, similarity thresholds, and the kinds of drift SemShift cares about. Pick one with --mode <name>.

default (--mode default)

General-purpose drift detection. Tone, hedging, and claim-strength shifts across any prose.

policy (--mode policy)

Watches privacy, terms, and compliance language for weakened commitments and new stakeholders.

prompt (--mode prompt)

Targets system prompts. Flags removed safety rules, altered refusals, and persona drift.

readme (--mode readme)

Checks compatibility, support, and licensing claims that quietly narrow or expand.

research (--mode research)

Detects hedge removal, sample-size loss, and "suggests → proves" escalations in drafts.

resume (--mode resume)

Catches inflated numbers, upgraded titles, and ownership claims introduced by AI polish.
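
To make the idea of mode-specific rules concrete, here is a minimal sketch of how a mode might map to cue-based checks. Everything in it is hypothetical: the cue lists, the rule names, the `MODE_RULES` table, and the `detect_shifts` function are invented for illustration and are not taken from SemShift's source. Real rules would need far larger lexicons and more careful matching.

```python
import re

# Hypothetical cue patterns; a real lexicon would be much larger.
ABSOLUTE = re.compile(r"\b(do not|never|always|must)\b", re.IGNORECASE)
CONDITIONAL = re.compile(r"\b(may|might|when relevant|up to)\b", re.IGNORECASE)
HEDGE = re.compile(r"\b(suggests?|modest|most|likely)\b", re.IGNORECASE)
CERTAINTY = re.compile(r"\b(proves?|significant|all|guarantees?)\b", re.IGNORECASE)

# Each rule fires when the old text has the "lost" cue, the new text
# no longer has it, and the new text has the "gained" cue instead.
RULES = {
    "commitment weakened": (ABSOLUTE, CONDITIONAL),
    "claim escalated": (HEDGE, CERTAINTY),
}

# Assumed mapping from mode name to the rules it applies.
MODE_RULES = {
    "policy": ["commitment weakened"],
    "prompt": ["commitment weakened"],
    "research": ["claim escalated"],
    "default": ["commitment weakened", "claim escalated"],
}


def detect_shifts(old: str, new: str, mode: str = "default") -> list[str]:
    """Return the names of the rules triggered by an old/new chunk pair."""
    flags = []
    for name in MODE_RULES.get(mode, MODE_RULES["default"]):
        lost, gained = RULES[name]
        if lost.search(old) and not lost.search(new) and gained.search(new):
            flags.append(name)
    return flags


if __name__ == "__main__":
    print(
        detect_shifts(
            "We do not share user data with third parties.",
            "We may share user data with trusted partners.",
            mode="policy",
        )
    )
```

Keeping rules as plain data makes them easy to read and audit, at the cost of missing paraphrases that avoid the listed cue words.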

Three ways to run SemShift.

Use it on your laptop, embed it in your Python pipeline, or wire it into your GitHub workflow. Setup takes about 60 seconds.

terminal
$ pip install semshift
$ semshift compare old.md new.md --mode policy
$ semshift compare prompt_v1.txt prompt_v2.txt --mode prompt --json
$ semshift compare README.md README.new.md --mode readme --fail-on high

# Optional semantic embeddings
$ pip install "semshift[models]"
review.py
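from semshift import compare_files

# Compare two versions of a policy document in policy mode.
result = compare_files(
    "policy_v1.md",
    "policy_v2.md",
    mode="policy",
)

# Print the report and exit non-zero when drift is rated high or critical.
if result.drift_label in {"high", "critical"}:
    print(result.summary)
    print(result.to_markdown())
    raise SystemExit(1)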
.github/workflows/semshift.yml
- uses: actions/checkout@v5
  with:
    fetch-depth: 0
- uses: VeerajSai/SemShift@v0.2.0
  with:
    mode: policy
    fail_on: high
    pr_comment: "true"
    exclude_paths: ".github/workflows/**"
    artifact_name: semshift-policy-report
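
A setting like `fail_on: high` implies an ordering of severity labels and a rule for turning a label into a process exit code. The sketch below shows one plausible way to do that. The label ordering (including a `low` level), the `exit_code` function, and the `warn_only` parameter are assumptions for illustration, not SemShift's documented behavior.

```python
# Assumed ordering of drift labels, least to most severe.
SEVERITY_ORDER = ("low", "medium", "high", "critical")


def exit_code(drift_label: str, fail_on: str = "high", warn_only: bool = False) -> int:
    """Return 1 when the drift label meets or exceeds the threshold, else 0.

    In warn-only mode the result is always 0, so CI reports drift without failing.
    """
    if warn_only:
        return 0
    rank = SEVERITY_ORDER.index
    return 1 if rank(drift_label.lower()) >= rank(fail_on.lower()) else 0


if __name__ == "__main__":
    # A "critical" finding fails a job configured with fail_on="high".
    raise SystemExit(exit_code("critical", fail_on="high"))
```

Because the mapping is a pure function of the label and the threshold, the same inputs always yield the same exit code, which keeps a CI gate predictable.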

Built for human review,
not to replace it.

SemShift is opinionated about a small thing: surfacing meaning changes. It stays out of the way on everything else.

Open-source

MIT-licensed Python. Read the rules, fork the modes, audit the scoring. No black-box judgments.

Local-first

TF-IDF runs locally. Optional SentenceTransformer embeddings run locally after model weights are downloaded.

CI-friendly

Deterministic exit codes, warn-only mode, JSON and Markdown reports, configurable fail_on thresholds. Built to plug in.

Transparent reports

Every flag includes the chunks compared, the score, and the rule that triggered it. No hidden reasoning.

Reviewer assist

Designed to make a human reviewer faster — not to autonomously approve or block documents.

For the AI era

Tuned for text that has been touched by an LLM — where surface looks fine but meaning may have moved.

SemShift is a review assistant, not a legal, scientific, or factual authority. Use it to surface candidates for human review — not as a final source of truth.
Start using it

Stop reviewing only words.
Start reviewing meaning.

One pip install. One CLI command. A meaningful guardrail against silent AI rewrites.

Star on GitHub
$ pip install semshift
60-second setup · No account required · Local-first, MIT licensed · Runs in your CI