X For You algorithm, line by line — Part 14: Ad blending

Part 14 of the deep dive into xai-org/x-algorithm. The home-mixer/ads/ module: SafeGapAdsBlender (preserves organic order, fills gaps), PartitionOrganicAdsBlender (sandwich pattern with brand-safety partitioning), spacing inference from ad-service positions, three-rule adjacency enforcement (BSR / handle / keyword). Last session before we leave Rust for Python Phoenix.

May 15, 2026·17 min read

The home-mixer/ads/ module handles ad blending into the feed: deciding where to inject ads, applying brand-safety adjacency rules, and choosing between two competing blending strategies. This is the last home-mixer/ directory we read — after this we move into the Python Phoenix code.

Files covered (533 LOC):

home-mixer/ads/
├── mod.rs                          (20)   AdsBlender trait + re-exports
├── util.rs                         (228)  shared helpers (spacing, safe gaps, brand-safety enforcement)
├── safe_gap_blender.rs             (95)   strategy 1: place ads at safe gaps
└── partition_organic_blender.rs    (190)  strategy 2: partition organic posts around ads

From Session 10's BlenderSelector, we know the AdsBlenderType feature switch decides which strategy runs: "safe_gap" → SafeGapAdsBlender; anything else → PartitionOrganicAdsBlender.


mod.rs (20 lines)

mod partition_organic_blender;
mod safe_gap_blender;
pub(crate) mod util;

pub use partition_organic_blender::PartitionOrganicAdsBlender;
pub use safe_gap_blender::SafeGapAdsBlender;

use util::{record_ad_risk_stats, record_post_verdict_stats};
use xai_home_mixer_proto::{FeedItem, ScoredPost};
use xai_recsys_proto::AdIndexInfo;

pub trait AdsBlender: Send + Sync {
    fn blend_inner(&self, scored_posts: Vec<ScoredPost>, ads: Vec<AdIndexInfo>) -> Vec<FeedItem>;

    fn blend(&self, scored_posts: Vec<ScoredPost>, ads: Vec<AdIndexInfo>) -> Vec<FeedItem> {
        record_post_verdict_stats(&scored_posts);
        record_ad_risk_stats(&ads);
        self.blend_inner(scored_posts, ads)
    }
}

Module declarations + re-exports. The interesting part: the AdsBlender trait.

The trait has two methods:

  • blend_inner — the implementation hook each concrete blender overrides.
  • blend — the public entry. Calls record_*_stats first, then delegates to blend_inner.

This is the Template Method pattern in Rust: the trait provides the public flow, the implementer fills in the variable middle. Stats happen automatically for both strategies. From Session 10's BlenderSelector, callers always call blend, never blend_inner.
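
A stripped-down sketch of the same shape may make the pattern easier to see. Everything here is a hypothetical stand-in: Vec<u32> replaces ScoredPost / FeedItem, STATS_CALLS replaces the stats receiver, and the Identity and Reversed strategies are invented for illustration.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Hypothetical stand-in for the stats receiver: counts how often the shared pre-step ran.
static STATS_CALLS: AtomicUsize = AtomicUsize::new(0);

trait Blender {
    // Hook: each strategy supplies only this.
    fn blend_inner(&self, posts: Vec<u32>) -> Vec<u32>;

    // Template method: shared pre-step first, then the strategy-specific hook.
    fn blend(&self, posts: Vec<u32>) -> Vec<u32> {
        STATS_CALLS.fetch_add(1, Ordering::Relaxed);
        self.blend_inner(posts)
    }
}

// Two invented strategies that differ only in their hook.
struct Identity;
struct Reversed;

impl Blender for Identity {
    fn blend_inner(&self, posts: Vec<u32>) -> Vec<u32> {
        posts
    }
}

impl Blender for Reversed {
    fn blend_inner(&self, mut posts: Vec<u32>) -> Vec<u32> {
        posts.reverse();
        posts
    }
}
```

Calling blend on either strategy bumps the counter once, without the strategy knowing about it. One caveat: Rust has no final methods, so nothing stops an implementer from overriding blend as well. The pattern holds by convention.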

pub(crate) mod util: util is visible anywhere inside the crate but not outside it. The two stats functions are imported for use in blend, not re-exported.


util.rs (228 lines)

The shared utilities used by both blenders. Six categories:

  1. Constants and types.
  2. Brand-safety predicates.
  3. Spacing computation.
  4. Adjacency enforcement (BSR / handle / keyword).
  5. Building / interleaving helpers.
  6. Stats emission.

Constants and types

use crate::params::RESULT_SIZE;
use std::sync::LazyLock;
use xai_home_mixer_proto::{BrandSafetyVerdict, FeedItem, ScoredPost, feed_item};
use xai_post_text::TweetTokenizer;
use xai_recsys_proto::{AdIndexInfo, BrandSafetyRiskLevel};
use xai_stats_receiver::global_stats_receiver;

static TWEET_TOKENIZER: LazyLock<TweetTokenizer> = LazyLock::new(TweetTokenizer::new);

pub(crate) const MIN_POSTS_FOR_ADS: usize = 5;

pub(crate) const MIN_REQUESTED_GAP: usize = 3;

pub(crate) const DEFAULT_SPACING: AdSpacing = AdSpacing {
    requested: 3,
    min: 2,
};

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub(crate) struct AdSpacing {
    pub(crate) requested: usize,
    pub(crate) min: usize,
}

Important constants:

  • MIN_POSTS_FOR_ADS = 5 — don't inject ads if fewer than 5 posts. Avoid ad-heavy responses on sparse feeds.
  • MIN_REQUESTED_GAP = 3 — gaps below this fall back to default. Sanity guard against ad index returning nonsense.
  • DEFAULT_SPACING: requested 3 (ideal spacing), min 2 (hard floor). So ads land 2-3 posts apart.

AdSpacing is the two-field config that drives placement.

TWEET_TOKENIZER: LazyLock<TweetTokenizer> — global lazy-initialized tokenizer. One tokenizer instance across the entire process, built on first access. Used in keyword-matching below.

Brand-safety predicates

pub(crate) fn has_avoid(post: &ScoredPost) -> bool {
    post.brand_safety_verdict() == BrandSafetyVerdict::MediumRisk
}

has_avoid — true if the post is MediumRisk (the most cautious verdict from Session 04's compute_verdict). Ads should avoid being adjacent to these. Note Safe and LowRisk are both OK.

pub(crate) fn find_safe_gaps(scored_posts: &[ScoredPost]) -> Vec<usize> {
    let n = scored_posts.len();
    let mut safe = Vec::new();
    for g in 1..n {
        if has_avoid(&scored_posts[g - 1]) {
            continue;
        }
        if g < n && has_avoid(&scored_posts[g]) {
            continue;
        }
        safe.push(g);
    }
    safe
}

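find_safe_gaps: for each gap index g in 1..n, an ad may be inserted before post g only if neither posts[g-1] (above) nor posts[g] (below) is flagged by has_avoid. It returns the safe gap indices in ascending order. The g < n check is always true inside this loop, and gaps 0 and n are never candidates, so the function never proposes a slot before the first post or after the last one.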

So gap 1 = "between post 0 and post 1." Both posts at the boundary must be safe.

This is SafeGapAdsBlender's primary mechanism: ads only go at safe gaps.
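
A small worked example, as a sketch over bare verdicts rather than ScoredPost. The Verdict enum and the safe_gaps name are stand-ins invented here; only the loop logic follows find_safe_gaps.

```rust
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
enum Verdict {
    Safe,
    LowRisk,
    MediumRisk,
}

// Same rule as find_safe_gaps: gap g is safe when neither the post above
// (g - 1) nor the post below (g) is MediumRisk.
fn safe_gaps(verdicts: &[Verdict]) -> Vec<usize> {
    let avoid = |v: Verdict| v == Verdict::MediumRisk;
    (1..verdicts.len())
        .filter(|&g| !avoid(verdicts[g - 1]) && !avoid(verdicts[g]))
        .collect()
}
```

For the feed [Safe, MediumRisk, Safe, Safe, LowRisk, Safe], gaps 1 and 2 both touch the MediumRisk post at index 1, so only gaps 3, 4, and 5 survive. The LowRisk post at index 4 does not block anything at this stage.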

Spacing computation

pub(crate) fn compute_spacing(ads: &[AdIndexInfo]) -> AdSpacing {
    if ads.len() < 2 {
        return DEFAULT_SPACING;
    }

    let mut positions: Vec<i32> = ads.iter().take(4).map(|a| a.insert_position).collect();
    positions.sort_unstable();

    let min_diff = positions
        .windows(2)
        .map(|w| (w[1] - w[0]) as usize)
        .filter(|&d| d > 0)
        .min();

    match min_diff {
        Some(requested) if requested >= MIN_REQUESTED_GAP => AdSpacing {
            requested,
            min: requested.div_ceil(2),
        },
        _ => DEFAULT_SPACING,
    }
}

The ad service returns each ad with an insert_position — its preferred placement. We use the minimum gap between adjacent preferred positions as the requested spacing.

The logic:

  1. Take the first 4 ads' positions.
  2. Sort.
  3. Compute consecutive differences (windows(2)).
  4. Filter out zeros (duplicate positions).
  5. Pick the minimum.

If the min diff is ≥ MIN_REQUESTED_GAP (3), use it. The min (hard floor) is requested / 2 ceiling. So requested=4 → min=2, requested=5 → min=3.

If too small or no diff, fall back to DEFAULT_SPACING.

Why ceiling div? 5 / 2 = 2 integer, but (5).div_ceil(2) = 3. We want the hard floor to not collapse below half. Bias toward more spacing rather than less.
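
To check the arithmetic, here is a sketch of the same computation over raw positions. The function name spacing_from_positions and the tuple return type are simplifications made up for this example; the real function takes AdIndexInfo and returns AdSpacing.

```rust
const MIN_REQUESTED_GAP: usize = 3;
const DEFAULT: (usize, usize) = (3, 2); // (requested, min), as in DEFAULT_SPACING

// Mirrors compute_spacing, but takes the insert positions directly.
fn spacing_from_positions(insert_positions: &[i32]) -> (usize, usize) {
    if insert_positions.len() < 2 {
        return DEFAULT;
    }
    // Only the first four ads contribute.
    let mut positions: Vec<i32> = insert_positions.iter().copied().take(4).collect();
    positions.sort_unstable();
    let min_diff = positions
        .windows(2)
        .map(|w| (w[1] - w[0]) as usize)
        .filter(|&d| d > 0) // duplicate positions carry no spacing information
        .min();
    match min_diff {
        Some(requested) if requested >= MIN_REQUESTED_GAP => (requested, requested.div_ceil(2)),
        _ => DEFAULT,
    }
}
```

Positions [2, 7, 12, 17] give (5, 3). Positions [1, 3, 5] have a minimum diff of 2, which is below the guard, so they fall back to (3, 2). Positions [4, 20, 8, 40, 41] give (4, 2) because the fifth position is never looked at.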

Adjacency enforcement

Three predicates check whether an ad should be dropped based on what's adjacent. All take (ad, above, below).

Brand Safety Risk (BSR)

pub(crate) fn is_bsr_low_ad(ad: &AdIndexInfo) -> bool {
    let risk = ad
        .ad_adjacency_control
        .as_ref()
        .map(|c| c.brand_safety_risk())
        .unwrap_or(BrandSafetyRiskLevel::BsrUnknown);
    matches!(
        risk,
        BrandSafetyRiskLevel::BsrLow | BrandSafetyRiskLevel::BsrIas
    )
}

pub(crate) fn should_drop_bsr_low(
    ad: &AdIndexInfo,
    above: Option<&ScoredPost>,
    below: Option<&ScoredPost>,
) -> bool {
    let risk = ad
        .ad_adjacency_control
        .as_ref()
        .map(|c| c.brand_safety_risk())
        .unwrap_or(BrandSafetyRiskLevel::BsrUnknown);
    if !matches!(
        risk,
        BrandSafetyRiskLevel::BsrLow | BrandSafetyRiskLevel::BsrIas
    ) {
        return false;
    }
    let is_lr = |p: &ScoredPost| p.brand_safety_verdict() == BrandSafetyVerdict::LowRisk;
    above.map(is_lr).unwrap_or(false) || below.map(is_lr).unwrap_or(false)
}

is_bsr_low_ad: does this ad have a low brand-safety-risk requirement? An ad with BsrLow or BsrIas (IAS = Integral Ad Science certification) requires extra-careful placement.

should_drop_bsr_low: drop the ad if it's a low-BSR ad AND either neighbor is LowRisk (an organic post that's not perfectly safe). So:

  • Low-BSR ads can only sit between Safe posts (both must be Safe).
  • Other ads tolerate LowRisk posts.
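  • All ads avoid MediumRisk. That is handled before these checks run: posts flagged by has_avoid never make it into a sandwich slot, and the safe-gap blender skips gaps next to them.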

This is a stricter version of the safe-gap rule for premium advertisers.
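
The rule reduces to a small truth table. A sketch with invented stand-in enums (Verdict and Risk here are not the proto types, and MediumRisk is left out because it never reaches this check):

```rust
#[derive(Clone, Copy, PartialEq, Eq)]
enum Verdict {
    Safe,
    LowRisk,
}

#[derive(Clone, Copy, PartialEq, Eq)]
enum Risk {
    Low,
    Ias,
    Unknown,
}

// Same decision as should_drop_bsr_low: only Low / Ias ads are strict, and a
// strict ad is dropped when either existing neighbor is LowRisk.
fn should_drop(risk: Risk, above: Option<Verdict>, below: Option<Verdict>) -> bool {
    if !matches!(risk, Risk::Low | Risk::Ias) {
        return false;
    }
    let is_lr = |v: Verdict| v == Verdict::LowRisk;
    above.map(is_lr).unwrap_or(false) || below.map(is_lr).unwrap_or(false)
}
```

A missing neighbor (None) counts as acceptable, so a strict ad with no neighbor on one side is judged only by the other side.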

Handle (advertiser-blocklist)

pub(crate) fn should_drop_handle(
    ad: &AdIndexInfo,
    above: Option<&ScoredPost>,
    below: Option<&ScoredPost>,
) -> bool {
    let handles = match ad.ad_adjacency_control.as_ref() {
        Some(ctrl) if !ctrl.handles.is_empty() => &ctrl.handles,
        _ => return false,
    };
    above
        .map(|p| handles.contains(&(p.author_id as i64)))
        .unwrap_or(false)
        || below
            .map(|p| handles.contains(&(p.author_id as i64)))
            .unwrap_or(false)
}

The ad carries a handles list — advertiser-specified user IDs to NOT appear next to (e.g., a competitor's account). If either neighbor's author is in the list, drop the ad.

Advertisers control their adjacency via this list. Premium feature.

Keyword

pub(crate) fn should_drop_keyword(
    ad: &AdIndexInfo,
    above: Option<&ScoredPost>,
    below: Option<&ScoredPost>,
) -> bool {
    let keywords = match ad.ad_adjacency_control.as_ref() {
        Some(ctrl) if !ctrl.keywords.is_empty() => &ctrl.keywords,
        _ => return false,
    };

    let tokenizer = &*TWEET_TOKENIZER;

    let tokenized_keywords: Vec<_> = keywords
        .iter()
        .map(|kw| tokenizer.tokenize(kw))
        .filter(|seq| !seq.is_empty())
        .collect();

    if tokenized_keywords.is_empty() {
        return false;
    }

    let text_matches = |p: &ScoredPost| {
        if p.tweet_text.is_empty() {
            return false;
        }
        let tweet_tokens = tokenizer.tokenize(&p.tweet_text);
        if tweet_tokens.is_empty() {
            return false;
        }
        tokenized_keywords
            .iter()
            .any(|kw_tokens| tweet_tokens.contains_keyword_sequence(kw_tokens))
    };
    above.map(text_matches).unwrap_or(false) || below.map(text_matches).unwrap_or(false)
}

The ad carries a keywords list — content-keyword blocklist. If either neighbor's text contains any of these keywords (as token sequences, not substring), drop.

Same tokenizer + matcher pattern as MutedKeywordFilter (Session 05). The tokenizer is shared (global LazyLock).

tweet_tokens.contains_keyword_sequence(kw_tokens) does token-sequence matching — "AAPL" matches $AAPL and #AAPL but not apple-pie (because the tokenizer would split that differently).

Advertiser blocks "covid" → their ad won't appear next to any post containing "covid" as a token.
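
To make "token sequence, not substring" concrete, here is a minimal sketch. The tokenizer below is an assumption invented for the example (lowercase, split on non-alphanumerics). TweetTokenizer's real rules are not shown in this part and will differ, and contains_sequence is a stand-in for contains_keyword_sequence, not its actual implementation.

```rust
// ASSUMPTION: toy tokenizer, not the real TweetTokenizer.
fn tokenize(text: &str) -> Vec<String> {
    text.split(|c: char| !c.is_alphanumeric())
        .filter(|t| !t.is_empty())
        .map(|t| t.to_lowercase())
        .collect()
}

// True if `needle` occurs as a contiguous run of whole tokens in `haystack`.
// The emptiness check comes first because windows(0) would panic.
fn contains_sequence(haystack: &[String], needle: &[String]) -> bool {
    !needle.is_empty() && haystack.windows(needle.len()).any(|w| w == needle)
}
```

Under this toy tokenizer, "covid" matches "New COVID wave reported" but not "covidien earnings", and the two-token keyword "market crash" matches only when the tokens are adjacent and in order.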

Building / interleaving helpers

pub(crate) fn posts_to_feed_items(scored_posts: Vec<ScoredPost>) -> Vec<FeedItem> {
    scored_posts
        .into_iter()
        .enumerate()
        .map(|(i, post)| FeedItem {
            position: i as i32,
            item: Some(feed_item::Item::Post(post)),
        })
        .collect()
}

Wrap each post in a FeedItem with position = index. Used when no ads will be blended.

pub(crate) fn interleave_and_finalize(
    scored_posts: Vec<ScoredPost>,
    ads: Vec<AdIndexInfo>,
    placements: &[usize],
) -> Vec<FeedItem> {
    let n = scored_posts.len();
    let mut items: Vec<FeedItem> = Vec::with_capacity(n + placements.len());
    let mut ads_iter = ads.into_iter();
    let mut pi = 0;

    for (i, post) in scored_posts.into_iter().enumerate() {
        if pi < placements.len() && placements[pi] == i {
            items.push(FeedItem {
                position: 0,
                item: Some(feed_item::Item::Ad(ads_iter.next().unwrap())),
            });
            pi += 1;
        }
        items.push(FeedItem {
            position: 0,
            item: Some(feed_item::Item::Post(post)),
        });
    }

    items.truncate(RESULT_SIZE);
    if matches!(items.last(), Some(item) if matches!(item.item, Some(feed_item::Item::Ad(_)))) {
        items.pop();
    }

    for (i, item) in items.iter_mut().enumerate() {
        item.position = i as i32;
    }

    items
}

Walk posts in order. Before pushing each post, check if the current index matches the next placement — if so, insert an ad.

Two post-processing rules:

  1. Truncate to RESULT_SIZE.
  2. If the last item is an ad, drop it. A feed ending with an ad is bad UX (looks like the feed ran out and we filled with ads).

Then renumber positions sequentially.
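
A toy version shows both post-processing rules in action. Item, interleave, and the result_size parameter are simplifications invented here (the real code uses FeedItem and the RESULT_SIZE constant), and positions are implicit in the vector index.

```rust
#[derive(Debug, PartialEq, Clone, Copy)]
enum Item {
    Post(u32),
    Ad(u32),
}

// Toy interleave_and_finalize: an ad goes in before post i whenever i is the
// next placement; then truncate and strip a trailing ad.
fn interleave(posts: &[u32], ads: &[u32], placements: &[usize], result_size: usize) -> Vec<Item> {
    let mut items = Vec::with_capacity(posts.len() + placements.len());
    let mut ads_iter = ads.iter();
    let mut pi = 0;
    for (i, &post) in posts.iter().enumerate() {
        if pi < placements.len() && placements[pi] == i {
            if let Some(&ad) = ads_iter.next() {
                items.push(Item::Ad(ad));
            }
            pi += 1;
        }
        items.push(Item::Post(post));
    }
    items.truncate(result_size);
    if matches!(items.last(), Some(Item::Ad(_))) {
        items.pop();
    }
    items
}
```

With five posts and placements [1, 3], the output is post, ad, post, post, ad, post, post. Cut the result size to 5 and truncation leaves an ad in last place, which the strip rule then removes, so only four items come back.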

Stats emission

const VERDICT_METRIC: &str = "AdsBlender.post_brand_safety_verdict";
const RISK_METRIC: &str = "AdsBlender.ad_brand_safety_risk";

pub(crate) fn record_post_verdict_stats(posts: &[ScoredPost]) {
    let Some(receiver) = global_stats_receiver() else {
        return;
    };

    for post in posts {
        let label = post.brand_safety_verdict().as_str_name();
        receiver.incr(VERDICT_METRIC, &[("verdict", label)], 1);
    }
}

pub(crate) fn record_ad_risk_stats(ads: &[AdIndexInfo]) {
    let Some(receiver) = global_stats_receiver() else {
        return;
    };

    for ad in ads {
        let risk_level = ad
            .ad_adjacency_control
            .as_ref()
            .map(|c| c.brand_safety_risk())
            .unwrap_or(BrandSafetyRiskLevel::BsrUnknown);

        receiver.incr(RISK_METRIC, &[("risk", risk_level.as_str_name())], 1);
    }
}

Pre-blending distribution metrics:

  • Post verdicts (Safe / LowRisk / MediumRisk / Unspecified) — per-post counter.
  • Ad risk levels (BsrLow / BsrIas / BsrHigh / BsrUnknown) — per-ad counter.

Lets dashboards monitor the mix entering the blender, separate from what comes out.


safe_gap_blender.rs (95 lines)

The simpler blender. Find safe gaps, fit ads into them.

pub struct SafeGapAdsBlender;

impl AdsBlender for SafeGapAdsBlender {
    fn blend_inner(&self, scored_posts: Vec<ScoredPost>, ads: Vec<AdIndexInfo>) -> Vec<FeedItem> {
        blend_impl(scored_posts, ads, MIN_POSTS_FOR_ADS)
    }
}

pub(crate) fn blend_impl(
    scored_posts: Vec<ScoredPost>,
    ads: Vec<AdIndexInfo>,
    min_posts: usize,
) -> Vec<FeedItem> {
    let n = scored_posts.len();

    if ads.is_empty() || n < min_posts {
        return posts_to_feed_items(scored_posts);
    }

    let safe_gaps = find_safe_gaps(&scored_posts);
    let spacing = compute_spacing(&ads);
    let first_ideal = ads[0].insert_position.max(0) as usize;
    let placements = assign_ads_to_gaps(&safe_gaps, ads.len(), &spacing, first_ideal);

    interleave_and_finalize(scored_posts, ads, &placements)
}

Five-step algorithm:

  1. Bail if no ads or too few posts.
  2. Find safe gaps (no MediumRisk neighbors).
  3. Compute spacing from ad service.
  4. First ad's ideal position = ads[0].insert_position (the ad service's first preference, clamped to non-negative).
  5. Assign ads to gaps greedily, then interleave.

pub(crate) fn assign_ads_to_gaps(
    safe_gaps: &[usize],
    num_ads: usize,
    spacing: &AdSpacing,
    first_ideal: usize,
) -> Vec<usize> {
    let mut placements: Vec<usize> = Vec::new();
    let mut search_from: usize = 0;
    let mut prev_ideal = first_ideal;

    for _ in 0..num_ads {
        if search_from >= safe_gaps.len() {
            break;
        }

        let (ideal, min) = match placements.last() {
            None => (first_ideal, 1),
            Some(&last_actual) => {
                let ideal = prev_ideal + spacing.requested;
                let min = (prev_ideal + spacing.min).max(last_actual + DEFAULT_SPACING.min);
                (ideal, min)
            }
        };

        let gap = find_best_gap(&safe_gaps[search_from..], ideal, min);

        match gap {
            Some((offset, g)) => {
                placements.push(g);
                search_from += offset + 1;
                prev_ideal = ideal;
            }
            None => break,
        }
    }

    placements
}

The placement loop. For each ad slot:

  • First ad: ideal = first_ideal, min = 1.
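  • Subsequent: ideal = prev_ideal + spacing.requested (cumulative). Min is the stricter (larger) of two lower bounds:
    • prev_ideal + spacing.min (stay at least the minimum spacing past the previous ideal).
    • last_actual + DEFAULT_SPACING.min (stay at least 2 positions past where the previous ad actually landed).

Taking the max of the two floors matters when the previous ad landed later than its ideal: the next ideal is still computed from prev_ideal, so without the second bound the next ad could be pulled right up against the previous one. When the previous ad landed early, the first bound dominates. Either way the visual rhythm holds.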

find_best_gap returns Some((offset, g)) where offset is the position within safe_gaps[search_from..] and g is the absolute gap index. Then advance search_from past that gap.

pub(crate) fn find_best_gap(gaps: &[usize], ideal: usize, min: usize) -> Option<(usize, usize)> {
    let min_offset = gaps.partition_point(|&g| g < min);
    if min_offset >= gaps.len() {
        return None;
    }
    let candidates = &gaps[min_offset..];
    let ideal_pos = candidates.partition_point(|&g| g < ideal);

    let chosen = if ideal_pos >= candidates.len() {
        candidates.len() - 1
    } else if ideal_pos == 0 {
        0
    } else {
        let below = candidates[ideal_pos - 1];
        let above = candidates[ideal_pos];
        if ideal - below <= above - ideal {
            ideal_pos - 1
        } else {
            ideal_pos
        }
    };

    Some((min_offset + chosen, candidates[chosen]))
}

Find the gap closest to ideal, with g >= min using binary search.

  1. partition_point(|&g| g < min) — first index in gaps where g >= min. (Since gaps is sorted ascending, returns the insertion point.)
  2. If no such position, return None.
  3. partition_point(|&g| g < ideal) within the candidates slice — first index where g >= ideal.
  4. Three cases:
    • Past the end: take the last candidate.
    • At zero (smallest candidate already ≥ ideal): take it.
    • Middle: compare the candidate just below ideal vs. just above — take whichever is closer.

Tie-breaking on equal distance (ideal - below <= above - ideal): prefer the earlier position. Bias toward higher up in the feed.

So SafeGapAdsBlender places ads at the safe gaps closest to the ad service's ideal positions, respecting min-spacing constraints.
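
A trace helps here. The sketch below is a condensed restatement of the two functions so it can run on its own: assign is a hypothetical name, it takes the two spacing fields as plain arguments instead of AdSpacing, and DEFAULT_MIN stands in for DEFAULT_SPACING.min.

```rust
const DEFAULT_MIN: usize = 2; // stands in for DEFAULT_SPACING.min

fn find_best_gap(gaps: &[usize], ideal: usize, min: usize) -> Option<(usize, usize)> {
    // Skip every gap below the floor.
    let min_offset = gaps.partition_point(|&g| g < min);
    let candidates = &gaps[min_offset..];
    if candidates.is_empty() {
        return None;
    }
    // First candidate that is >= ideal.
    let ideal_pos = candidates.partition_point(|&g| g < ideal);
    let chosen = if ideal_pos >= candidates.len() {
        candidates.len() - 1 // everything is below ideal: take the largest
    } else if ideal_pos == 0 {
        0 // everything is at or above ideal: take the smallest
    } else if ideal - candidates[ideal_pos - 1] <= candidates[ideal_pos] - ideal {
        ideal_pos - 1 // closer below, or a tie: prefer the earlier gap
    } else {
        ideal_pos
    };
    Some((min_offset + chosen, candidates[chosen]))
}

fn assign(
    safe_gaps: &[usize],
    num_ads: usize,
    requested: usize,
    min_spacing: usize,
    first_ideal: usize,
) -> Vec<usize> {
    let mut placements: Vec<usize> = Vec::new();
    let mut search_from = 0;
    let mut prev_ideal = first_ideal;
    for _ in 0..num_ads {
        if search_from >= safe_gaps.len() {
            break;
        }
        let (ideal, min) = match placements.last() {
            None => (first_ideal, 1),
            Some(&last) => (
                prev_ideal + requested,
                (prev_ideal + min_spacing).max(last + DEFAULT_MIN),
            ),
        };
        match find_best_gap(&safe_gaps[search_from..], ideal, min) {
            Some((offset, g)) => {
                placements.push(g);
                search_from += offset + 1;
                prev_ideal = ideal;
            }
            None => break,
        }
    }
    placements
}
```

Take safe gaps [1, 2, 4, 5, 9, 10], three ads, requested 4, min 2, first ideal 3. Ad 1 wants gap 3, which is unsafe; gaps 2 and 4 tie, so the earlier one (2) wins. Ad 2 wants 7 with a floor of 5; gaps 5 and 9 tie, so it lands at 5. Ad 3 wants 11, past every candidate, so it takes the last gap, 10. Result: [2, 5, 10]. Ideals accumulate from first_ideal (3, 7, 11), not from where ads actually landed.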


partition_organic_blender.rs (190 lines) — strategy 2

The more sophisticated blender. Partition posts by brand safety — keep safe and unsafe ones separate, sandwich each ad between two safe posts, fill the rest with unsafe + remaining safe posts.

const ENFORCEMENT_METRIC: &str = "PartitionOrganic.enforcement";

pub struct PartitionOrganicAdsBlender;

impl AdsBlender for PartitionOrganicAdsBlender {
    fn blend_inner(&self, scored_posts: Vec<ScoredPost>, ads: Vec<AdIndexInfo>) -> Vec<FeedItem> {
        blend_impl(scored_posts, ads, MIN_POSTS_FOR_ADS)
    }
}

Same trait impl as the safe-gap version. The big logic difference is in blend_impl:

pub(crate) fn blend_impl(
    scored_posts: Vec<ScoredPost>,
    ads: Vec<AdIndexInfo>,
    min_posts: usize,
) -> Vec<FeedItem> {
    let n = scored_posts.len();

    if ads.is_empty() || n < min_posts {
        return posts_to_feed_items(scored_posts);
    }

    let spacing = compute_spacing(&ads);

    let safe_count = scored_posts.iter().filter(|p| !has_avoid(p)).count();
    let max_from_safe = safe_count / 2;
    let expected_from_spacing = if spacing.requested > 0 {
        n.saturating_sub(1) / spacing.requested
    } else {
        0
    };
    let actual_ads = ads.len().min(expected_from_spacing).min(max_from_safe);

    if actual_ads == 0 {
        return posts_to_feed_items(scored_posts);
    }

Compute how many ads to actually place: take the min of three numbers:

  • Number of ads available.
  • Expected from spacing: (n - 1) / spacing.requested. For 30 posts, spacing 3 → 9 ads max.
  • safe_count / 2: each ad needs 2 safe posts (sandwich), so capped at half the safe count.

saturating_sub(1) avoids underflow on n=0 (already gated above, but defensive).
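
The three-way cap is easy to check with numbers. A sketch, with ads_to_place as a made-up name for the arithmetic above:

```rust
// Number of ads the partition blender will try to place: the smallest of
// ads available, the spacing budget, and half the safe posts.
fn ads_to_place(num_ads: usize, num_posts: usize, safe_count: usize, requested: usize) -> usize {
    let max_from_safe = safe_count / 2;
    let expected_from_spacing = if requested > 0 {
        num_posts.saturating_sub(1) / requested
    } else {
        0
    };
    num_ads.min(expected_from_spacing).min(max_from_safe)
}
```

With 5 ads, 30 posts, 8 safe posts, and spacing 3, the caps are 5, 9, and 4, so 4 ads are attempted. With a single safe post the answer is 0, which triggers the posts-only return.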

    let mut safe: Vec<ScoredPost> = Vec::new();
    let mut unsafe_posts: Vec<ScoredPost> = Vec::new();
    for post in scored_posts {
        if has_avoid(&post) {
            unsafe_posts.push(post);
        } else {
            safe.push(post);
        }
    }

    let num_safe = safe.len();
    let group_size = num_safe / actual_ads;

Partition posts into safe and unsafe lists. Compute group_size = how many safe posts per ad-sandwich group.

E.g., 12 safe posts + 3 ads → group_size = 4. Each ad will sit between the safe posts at offsets 0 and 1 of its group: groups start at 0, 4, and 8 in the safe list, so the sandwich pairs are indices (0, 1), (4, 5), and (8, 9).
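
The slot arithmetic as a sketch (sandwich_slots is a hypothetical helper, not something in the source):

```rust
// Index pairs in the safe list that each ad would be sandwiched between.
// The caller must pass actual_ads > 0, as blend_impl guarantees at this point.
fn sandwich_slots(num_safe: usize, actual_ads: usize) -> Vec<(usize, usize)> {
    let group_size = num_safe / actual_ads;
    (0..actual_ads)
        .map(|group_idx| {
            let start = group_idx * group_size;
            (start, start + 1)
        })
        .collect()
}
```

Because actual_ads is capped at half the safe count, group_size is always at least 2, so the pairs never overlap and never run off the end of the safe list. With 7 safe posts and 3 ads the groups are packed tight, and the safe post at index 6 is simply left for the filler.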

    let mut safe_opts: Vec<Option<ScoredPost>> = safe.into_iter().map(Some).collect();
    let mut triples: Vec<(AdIndexInfo, ScoredPost, ScoredPost)> = Vec::new();

    let mut bsr_drop: u64 = 0;
    let mut bsr_ok: u64 = 0;
    let mut handle_drop: u64 = 0;
    let mut keyword_drop: u64 = 0;

    let mut group_idx = 0;

safe_opts: Vec<Option<ScoredPost>> — wrapping each safe post in Option lets us take() (steal) posts when building sandwiches, leaving None markers behind. The remaining Some values flow into the filler list.

triples will hold (ad, post_above, post_below) for each placed ad.

Enforcement counters for the metrics emit at the end.
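
The Option-wrapping trick in isolation, as a generic sketch. take_pair is an invented helper that does one sandwich's worth of work; the real loop does this once per placed ad against the same safe_opts vector.

```rust
// Removes the elements at `start` and `start + 1`, returning them together
// with everything else in its original order.
fn take_pair<T>(items: Vec<T>, start: usize) -> Option<((T, T), Vec<T>)> {
    let mut slots: Vec<Option<T>> = items.into_iter().map(Some).collect();
    if start + 1 >= slots.len() {
        return None;
    }
    // take() moves the value out and leaves None behind, so indices stay stable.
    let above = slots[start].take()?;
    let below = slots[start + 1].take()?;
    // flatten() skips the None holes and keeps the survivors in order.
    let rest: Vec<T> = slots.into_iter().flatten().collect();
    Some(((above, below), rest))
}
```

The point of the holes is index stability: Vec::remove would shift every later element, which would break the group_idx * group_size arithmetic for the next sandwich.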

    for ad in ads {
        if group_idx >= actual_ads {
            break;
        }
        let group_start = group_idx * group_size;
        let above_ref = safe_opts[group_start].as_ref();
        let below_ref = safe_opts[group_start + 1].as_ref();

        if should_drop_bsr_low(&ad, above_ref, below_ref) {
            bsr_drop += 1;
            continue;
        }
        if is_bsr_low_ad(&ad) {
            bsr_ok += 1;
        }

        if should_drop_handle(&ad, above_ref, below_ref) {
            handle_drop += 1;
            continue;
        }

        if should_drop_keyword(&ad, above_ref, below_ref) {
            keyword_drop += 1;
            continue;
        }

        let above = safe_opts[group_start].take().unwrap();
        let below = safe_opts[group_start + 1].take().unwrap();
        triples.push((ad, above, below));
        group_idx += 1;
    }

For each ad slot (up to actual_ads):

  1. Compute the slot in safe_opts (positions group_idx * group_size and +1).
  2. Apply the three adjacency checks (BSR, handle, keyword). On any drop, count it and continue to the next ad without advancing group_idx — so the next ad gets a try at the same slot.
  3. If passed, take() the two safe posts and push to triples.

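Note: a drop doesn't advance the group. So if ad 0 fails BSR, ad 1 tries the same slot. If ad 1 also fails, ad 2 tries. Placements are capped at actual_ads; attempts are bounded by the length of the ads list, since every iteration consumes one ad whether or not it is placed.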
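
The control flow can be sketched without any of the adjacency logic. In this toy, each ad carries a precomputed pass/fail flag, which is a simplification: in the real loop the outcome depends on the two posts at the current slot. The place function and its tuple encoding are invented for the example.

```rust
// Each ad is (id, passes_checks). Returns (ids of placed ads, number dropped).
fn place(ads: &[(u32, bool)], actual_ads: usize) -> (Vec<u32>, usize) {
    let mut placed = Vec::new();
    let mut dropped = 0;
    let mut group_idx = 0;
    for &(id, passes) in ads {
        if group_idx >= actual_ads {
            break; // all slots filled: remaining ads are never evaluated
        }
        if !passes {
            dropped += 1; // the slot stays open; group_idx is not advanced
            continue;
        }
        placed.push(id);
        group_idx += 1;
    }
    (placed, dropped)
}
```

With two slots and ads that fail, fail, pass, pass, the third and fourth ads are placed and two drops are counted. With one slot and ads that pass, fail, pass, only the first ad is placed and nothing is counted as dropped, because the loop exits before looking at the second ad.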

    let placed_ads = triples.len();
    emit_enforcement_metrics(bsr_drop, bsr_ok, handle_drop, keyword_drop);

    if placed_ads == 0 {
        let mut all_posts: Vec<ScoredPost> = safe_opts.into_iter().flatten().collect();
        all_posts.extend(unsafe_posts);
        all_posts.sort_by(|a, b| {
            b.score
                .partial_cmp(&a.score)
                .unwrap_or(std::cmp::Ordering::Equal)
        });
        return posts_to_feed_items(all_posts);
    }

Zero-placement fallback: if no ad survived enforcement, return a posts-only feed. Merge the safe and unsafe pools, sort by score descending (undoing the partition), and wrap as feed items.

    let mut filler: Vec<ScoredPost> =
        Vec::with_capacity(num_safe - 2 * placed_ads + unsafe_posts.len());
    filler.extend(safe_opts.into_iter().flatten());
    filler.extend(unsafe_posts);
    filler.sort_by(|a, b| {
        b.score
            .partial_cmp(&a.score)
            .unwrap_or(std::cmp::Ordering::Equal)
    });

Build the filler list: remaining safe posts (the ones not taken into triples) + all unsafe posts, sorted by score descending.

The filler will go between ad sandwiches in score order.

    let inter_ad_gaps = placed_ads;
    let filler_per_gap = filler.len() / inter_ad_gaps;
    let remainder = filler.len() % inter_ad_gaps;
    let mut filler_iter = filler.into_iter();

    let mut items: Vec<FeedItem> = Vec::with_capacity(n + placed_ads);

    for (i, (ad, above, below)) in triples.into_iter().enumerate() {
        items.push(FeedItem {
            position: 0,
            item: Some(feed_item::Item::Post(above)),
        });
        items.push(FeedItem {
            position: 0,
            item: Some(feed_item::Item::Ad(ad)),
        });
        items.push(FeedItem {
            position: 0,
            item: Some(feed_item::Item::Post(below)),
        });

        let count = filler_per_gap + if i >= inter_ad_gaps - remainder { 1 } else { 0 };
        for _ in 0..count {
            if let Some(post) = filler_iter.next() {
                items.push(FeedItem {
                    position: 0,
                    item: Some(feed_item::Item::Post(post)),
                });
            }
        }
    }

Build the final timeline:

  • For each ad: push (above, ad, below) as a 3-element sandwich.
  • After the sandwich, push filler_per_gap filler posts (with the remainder distributed at the tail).

The remainder distribution: if i >= inter_ad_gaps - remainder — the last remainder gaps get one extra. So if filler is 13 and we have 3 ads, filler_per_gap = 4, remainder = 1. First gap gets 4, second gets 4, third gets 4+1=5.

This biases extras toward the end of the feed, so ads end up slightly sparser toward the bottom. That plausibly matches typical scroll behavior, though the source does not state the motivation.
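
The distribution rule as a standalone sketch (filler_counts is a made-up helper that returns how many filler posts follow each sandwich):

```rust
// Filler posts after each sandwich; the last `remainder` sandwiches get one extra.
// The caller must pass placed_ads > 0, as the zero case returns earlier.
fn filler_counts(filler_len: usize, placed_ads: usize) -> Vec<usize> {
    let per_gap = filler_len / placed_ads;
    let remainder = filler_len % placed_ads;
    (0..placed_ads)
        .map(|i| per_gap + usize::from(i >= placed_ads - remainder))
        .collect()
}
```

13 filler posts over 3 ads gives [4, 4, 5]; 14 gives [4, 5, 5]; 12 divides evenly into [4, 4, 4]. The counts always sum to the filler length, so no post is lost before truncation.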

    items.truncate(RESULT_SIZE);
    if matches!(items.last(), Some(item) if matches!(item.item, Some(feed_item::Item::Ad(_)))) {
        items.pop();
    }
    for (i, item) in items.iter_mut().enumerate() {
        item.position = i as i32;
    }

    items
}

Same finalization as the safe-gap blender:

  • Truncate to result size.
  • Strip trailing ad.
  • Renumber positions.

fn emit_enforcement_metrics(bsr_drop: u64, bsr_ok: u64, handle_drop: u64, keyword_drop: u64) {
    let Some(receiver) = global_stats_receiver() else {
        return;
    };
    if bsr_drop > 0 {
        receiver.incr(ENFORCEMENT_METRIC, &[("action", "drop")], bsr_drop);
    }
    if bsr_ok > 0 {
        receiver.incr(ENFORCEMENT_METRIC, &[("action", "ok")], bsr_ok);
    }
    if handle_drop > 0 {
        receiver.incr(
            ENFORCEMENT_METRIC,
            &[("action", "handle_drop")],
            handle_drop,
        );
    }
    if keyword_drop > 0 {
        receiver.incr(
            ENFORCEMENT_METRIC,
            &[("action", "keyword_drop")],
            keyword_drop,
        );
    }
}

Emit per-rule counters: three drop counts plus the BSR pass count (action "ok"). Zero counts are skipped. Lets dashboards see "we dropped N ads for keyword violations today."


What we've learned

Two ad blending strategies:

  1. SafeGapAdsBlender: find positions where neither neighbor is MediumRisk, then place ads at those positions matching the ad service's preferred positions. Simpler. Preserves original organic order.
  2. PartitionOrganicAdsBlender: split posts into safe/unsafe pools, sandwich ads between safe pairs, fill with score-sorted leftovers. More aggressive — reorders posts.

The trade-off:

  • Safe-gap preserves organic ordering (good UX for engagement signal).
  • Partition-organic maximizes ad-safe placements (more revenue from premium ads, but reorders).

Feature flag picks per request.

Brand-safety adjacency rules:

  • Safe gaps: never put an ad next to MediumRisk content (universal).
  • BSR-Low / BSR-IAS ads (premium): also avoid LowRisk content. Only Safe neighbors.
  • Handles: per-ad advertiser blocklist of user IDs.
  • Keywords: per-ad blocklist of token sequences. Uses the same TweetTokenizer (global LazyLock).

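Spacing inference from ad service: the ad service returns each ad with insert_position. The smallest non-zero difference between the first 4 ads' preferred positions tells the blender how tightly the ad service wants ads packed. If that difference is below 3, or there is no usable difference (fewer than two ads, or all positions identical), the blender falls back to DEFAULT_SPACING (requested 3, min 2).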

The taken-then-flatten idiom: Vec<Option<T>> + .take() to steal selected elements, then safe_opts.into_iter().flatten() to collect what's left. Preserves order, marks "used" positions.

Trailing-ad strip: both blenders drop a trailing ad after truncation. Prevents the feed from ending on an ad — bad UX.

AdsBlender template-method: the trait's blend() records stats before calling blend_inner(). Both strategies inherit stats for free.

Enforcement metrics with sparse keys: if bsr_drop > 0 { incr(...) } — only emit counters when they have non-zero values. Keeps Prometheus tag sets cleaner.

Filler distribution: if i >= inter_ad_gaps - remainder { 1 } else { 0 } distributes the integer remainder of filler-per-gap by giving the trailing gaps the extras. Subtle UX choice: more posts toward the bottom of the response.

Why two strategies coexist: the feature switch suggests an A/B test in production, though the source does not say so. If one strategy wins, the other presumably becomes orphan code. The trait abstraction keeps this clean: BlenderSelector (Session 10) just picks via the feature switch.


The end of the Rust home-mixer tour

This wraps up the Rust orchestration service. Series progress so far:

Sessions   Module                         LOC
01         candidate-pipeline framework   1,031
02-03      thunder                        1,808
04-14      home-mixer                     ~11,700
Total Rust covered                        ~14,539

The For You feed pipeline is now fully read. From the gRPC entry point through every query hydrator, source, candidate hydrator, filter, scorer, post-selection hydrator, post-selection filter, side-effect, and ad blender.

Next session

Session 15 — Phoenix models (1068 LOC). We leave Rust behind and enter the Python ML code:

  • phoenix/recsys_model.py (680) — the Grok-based transformer for ranking
  • phoenix/recsys_retrieval_model.py (388) — the two-tower retrieval model

This is where the "Grok-based transformer model" mentioned in the README is actually implemented. Attention masks, candidate-isolation logic, multi-head architectures, hash embeddings — all the ML.

After Session 15: Phoenix runners, then Grok tests, then the entire grox/ content-classification pipeline (Sessions 18-22).