This article examines the critical distinction between source-level and activity-level propositions in forensic science, a paradigm shift essential for providing legally relevant conclusions. Aimed at researchers, scientists, and legal professionals, it explores the foundational hierarchy of propositions, details methodological frameworks like likelihood ratios and Bayesian networks for activity-level evaluation, and addresses key implementation barriers such as data limitations and training needs. By comparing the probative value of different proposition levels and validating advanced approaches, this review synthesizes a path forward for enhancing the utility and robustness of forensic evidence in judicial contexts.
The hierarchy of propositions is a fundamental concept in forensic science, providing a structured framework for evaluating the strength of scientific evidence in a legal context. This logical framework is essential for managing uncertainty and assisting the court in its decision-making process [1]. The hierarchy helps DNA scientists reason in a balanced, robust, and transparent way, moving from simple questions about the source of a DNA profile to more complex questions about activities and offenses [1].
In contemporary forensic practice, there is a recognized shift from the traditional question of "whose DNA is this?" toward the more probative question of "how did it get there?" [2]. This evolution reflects the recognition that while the source of DNA may not be contested, the mechanism of transfer and the activities that led to the DNA being deposited are often central to the issues before the court. The hierarchy of propositions provides a systematic approach to address these different levels of questions in a logically coherent manner [2].
The hierarchy of propositions is structured into distinct levels, each addressing different types of questions and requiring different types of information and data for evaluation. The table below outlines the core levels within this framework:
Table 1: Levels in the Hierarchy of Propositions
| Level | Primary Question | Focus of Evaluation | Data Requirements |
|---|---|---|---|
| Sub-Source | Does the DNA profile originate from a specific individual? | Analytical features of the DNA profile [2]. | DNA profile rarity, population data [2]. |
| Source | Does the biological material originate from a specific individual? | Source of the biological material (e.g., blood, saliva) [1]. | DNA profile data, possibly information on cellular origin. |
| Activity | How did the DNA transfer occur during an alleged event? | Actions and activities leading to transfer [2]. | Transfer, persistence, prevalence (background) data [2]. |
| Offense | Did the suspect commit the crime? | Ultimate issue before the court [1]. | All evidence, including forensic results and case facts. |
Sub-Source Level: This is the most fundamental level, dealing exclusively with the DNA profile itself. The evaluation focuses on comparing the analytical features of a recovered DNA profile with a reference profile from a person of interest (POI). The result is typically expressed as a likelihood ratio that assesses the probability of the DNA evidence under two competing propositions: the POI is the source of the profile versus an unknown individual is the source [2].
Source Level: Building on the sub-source level, the source level addresses the origin of the biological material. Here, the question expands from the DNA profile to the biological source of the trace (e.g., "Is the bloodstain from Mr. A?"). While the DNA profile is a key component, this level may consider other information about the nature of the biological material [1].
Activity Level: This level evaluates the results given specific propositions about activities or actions. It addresses how a given trace arrived on a particular surface or item. Example propositions could be "Mr. A punched the victim" versus "Mr. A shook hands with the victim" [2]. At this level, factors beyond DNA profile rarity become critical, including the probabilities of transfer, persistence, and background presence of DNA [2]. This evaluation is more complex but often provides more probative value to the court.
Offense Level: The highest level in the hierarchy deals with the ultimate issue the court must decide, such as whether a suspect is guilty of a crime. It is widely accepted that forensic scientists should not express opinions on offense-level propositions, as these are the exclusive purview of the court and require the integration of all pieces of evidence, not just the forensic scientific findings [1].
The following diagram illustrates the logical relationships and key considerations when moving through the hierarchy of propositions from sub-source to offense level.
The first principle of evaluative reporting emphasizes that interpretation must occur within a framework of circumstances [1]. Before conducting analyses, scientists must engage in a pre-assessment phase to understand the case context and identify the relevant propositions. This ensures that the evaluation addresses the key questions in the case and that the necessary data and analyses are obtained. The pre-assessment is particularly crucial when addressing activity-level propositions, where factors of transfer and persistence need to be considered [1].
Case information is categorized as either task-pertinent or task-irrelevant [1]. Task-pertinent information directly influences the evaluation, such as the alleged timing of events, the nature of contact, or the environment where the trace was found. For activity-level evaluations, this information is essential for formulating realistic propositions and assigning probabilities. Task-irrelevant information does not impact the scientific evaluation and should be excluded to maintain objectivity.
Objective: To provide a structured methodology for evaluating forensic DNA results given activity-level propositions.
Procedure:
1. Case Pre-Assessment
2. Proposition Formulation
3. Relevant Data Identification
4. Likelihood Ratio (LR) Calculation
5. Sensitivity Analysis (if needed)
6. Reporting
The following table details essential materials and their functions for research and data generation in the field of forensic DNA interpretation.
Table 2: Essential Research Reagents and Materials for Forensic DNA Interpretation
| Item/Category | Function/Application |
|---|---|
| Population DNA Databases | Provides allele frequency data necessary for calculating DNA profile rarity at the sub-source level [2]. |
| Transfer & Persistence Studies | Empirical data on the mechanisms and rates of DNA transfer under different conditions and its persistence over time; crucial for activity level evaluation [2]. |
| Statistical Software & Models | Enables the computation of Likelihood Ratios (LRs) and provides accepted models for evaluating DNA evidence at various levels of the hierarchy. |
| Forensic Interpretation Guidelines | Published guidelines (e.g., from ENFSI, OSAC) provide standardized frameworks and best practices for logical and transparent evaluative reporting [1]. |
| Sensitivity Analysis Tools | Methods and software to test the robustness of evaluative conclusions when faced with uncertainty about specific activity parameters [2]. |
The hierarchical framework from sub-source to offense level provides a logical and structured approach for forensic scientists to evaluate DNA evidence. While evaluations at the sub-source and source levels are more established and rely on robust population data, the move toward activity-level evaluations is necessary to address the most relevant questions in modern criminal justice [2]. Although activity-level evaluations require more complex data on transfer, persistence, and background prevalence, these challenges can be met through controlled experimentation, careful use of available scientific knowledge, and transparent reporting of assumptions [2].
The implementation of this framework ensures that forensic reporting remains balanced, logical, transparent, and robust, ultimately enhancing its value to the justice system. By clearly understanding and applying the hierarchy of propositions, forensic scientists can provide more meaningful and probative assessments of scientific evidence, helping courts to understand the strength of DNA findings in the context of alleged activities.
Source-level propositions represent a fundamental tier in the hierarchy of propositions used in forensic science, situated between the sub-source and activity levels [3]. When forensic scientists address the question "Whose DNA is this?", they are operating primarily at the source level, seeking to identify the biological origin of a sample [2]. This differs from activity-level propositions, which investigate how the DNA was transferred to a particular location through specific actions [3] [2].
The evaluation of forensic evidence occurs at different levels within this hierarchy, with the appropriate level determined by the specific questions posed by the case and the information available for interpretation [4]. Source-level propositions are particularly suitable when transfer mechanisms are not disputed and the central issue concerns the biological origin of the trace material [5]. For example, when a large, fresh bloodstain is found at a burglary scene and a suspect claims never to have been at the premises, source-level propositions appropriately address whether the blood originated from the suspect or another unknown individual [5].
Table 1: Hierarchy of Propositions in DNA Evidence Interpretation
| Level | Focus | Example Propositions |
|---|---|---|
| Activity | How DNA was transferred through specific actions | "The defendant filled the bottles with petrol" vs. "An unknown offender filled the bottles" [6] |
| Source | Biological origin of the material | "The bloodstain came from the defendant" vs. "The bloodstain came from another unknown individual" [5] |
| Sub-Source | Source of DNA profile specifically | "The DNA came from the victim and accused" vs. "The DNA came from the victim and an unknown individual" [3] |
The identification of biological materials prior to DNA analysis involves a systematic workflow that progresses from non-destructive examinations to confirmatory testing [4].
Visual Examination and Documentation: Begin with macroscopic and microscopic examination of the stain or sample under various light sources (white light, alternative light sources). Document the physical characteristics including size, shape, color, and texture.
Presumptive Chemical Testing: Apply chemical tests that react with specific components of biological fluids, such as the phenolphthalein (Kastle-Meyer) test for hemoglobin in suspected bloodstains or the acid phosphatase test for suspected semen.
Confirmatory Testing: Proceed with microscopic examination for spermatozoa in suspected semen stains or immunochromatographic tests for specific blood antigens to confirm body fluid identity.
DNA Sample Collection: Once biological material is identified, collect appropriate samples using sterile swabs, cutting from substrates, or tape lifting depending on the surface characteristics.
The process of generating and interpreting DNA profiles from biological samples follows a standardized protocol:
DNA Extraction: Isolate DNA from biological materials using commercially available extraction kits, following manufacturer protocols. Include positive and negative controls to monitor extraction efficiency and contamination.
Quantification: Precisely measure the DNA concentration using quantitative PCR methods to ensure optimal amplification and identify potential inhibitors.
PCR Amplification: Amplify targeted Short Tandem Repeat (STR) loci using multiplex PCR kits. Maintain strict temperature controls and include appropriate positive and negative amplification controls.
Capillary Electrophoresis: Separate amplified DNA fragments by size and detect fluorescently labeled PCR products. Use internal size standards for precise fragment sizing.
Profile Interpretation: Analyze electropherogram data to designate alleles, assess peak heights and heterozygote balance, and distinguish true alleles from artifacts such as stutter.
Statistical Evaluation: Calculate match probabilities or likelihood ratios using validated statistical models and population databases [4].
The evaluation of forensic biology results given source-level propositions incorporates multiple quantitative measures to assess the strength of evidence [4]. The likelihood ratio framework provides a statistically rigorous method for expressing the probative value of DNA evidence, with the formula:
LR = Pr(E|Hp,I) / Pr(E|Hd,I)
Where E represents the forensic findings, Hp and Hd are the competing propositions, and I represents the case background information [3].
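For a single-source profile that matches the POI, the source-level LR reduces to the reciprocal of the random match probability computed by the product rule. The sketch below illustrates this arithmetic; the allele frequencies are invented for illustration, not drawn from any real population database.

```python
# Sketch: random match probability (RMP) via the product rule, and the
# corresponding LR for a matching single-source profile. Allele
# frequencies below are illustrative placeholders.

def locus_genotype_frequency(p: float, q: float) -> float:
    """Expected genotype frequency under Hardy-Weinberg equilibrium:
    p^2 for a homozygote, 2pq for a heterozygote."""
    return p * p if p == q else 2 * p * q

def random_match_probability(loci: list[tuple[float, float]]) -> float:
    """Multiply per-locus genotype frequencies across independent loci."""
    rmp = 1.0
    for p, q in loci:
        rmp *= locus_genotype_frequency(p, q)
    return rmp

# Three illustrative STR loci: (allele-1 frequency, allele-2 frequency).
loci = [(0.10, 0.20), (0.15, 0.15), (0.05, 0.30)]
rmp = random_match_probability(loci)

# For a matching profile, Pr(E|Hp) = 1 and Pr(E|Hd) = RMP, so LR = 1/RMP.
lr = 1.0 / rmp
print(f"RMP = {rmp:.3e}, LR = {lr:.3e}")
```

A full commercial multiplex types 20 or more loci, so real RMP values are far smaller than this three-locus toy; the structure of the calculation is the same.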
Table 2: Key Performance Metrics for Source-Level DNA Evidence Evaluation
| Metric | Calculation Method | Interpretation Guidelines |
|---|---|---|
| Match Probability | Frequency of occurrence in relevant population database | Values < 1 in 1 billion provide very strong support for proposition Hp [2] |
| Likelihood Ratio (LR) | Pr(E|Hp,I) / Pr(E|Hd,I) | LR > 1 supports Hp; LR < 1 supports Hd; LR = 1 inconclusive [3] |
| Stochastic Threshold | Established through validation studies (typically 100-200 pg) | Below this threshold, heterozygote balance and mixture interpretation become unreliable [4] |
| Peak Height Ratio | (Lower peak / Higher peak) × 100% for heterozygous alleles | Typically 60-80% expected for single source samples; lower ratios may indicate mixtures or degradation [4] |
| Analytical Threshold | Statistical analysis of background noise (typically 50-100 RFU) | Peaks above threshold are considered true alleles; below threshold are potential artifacts [4] |
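The threshold metrics in Table 2 lend themselves to simple programmatic screening. The sketch below applies an analytical threshold and a minimum peak height ratio to a heterozygote peak pair; the specific cut-offs (50 RFU, 60%) are illustrative values within the ranges quoted above, since real laboratories set thresholds through their own validation studies.

```python
# Sketch: screening heterozygote peak pairs against Table 2-style metrics.
# Threshold values are illustrative; labs derive their own by validation.

ANALYTICAL_THRESHOLD_RFU = 50   # peaks below this are treated as noise
MIN_PEAK_HEIGHT_RATIO = 0.60    # lower PHR may indicate mixture/degradation

def peak_height_ratio(height_a: float, height_b: float) -> float:
    """PHR = lower peak / higher peak for a heterozygous allele pair."""
    lo, hi = sorted((height_a, height_b))
    return lo / hi

def assess_locus(height_a: float, height_b: float) -> str:
    if min(height_a, height_b) < ANALYTICAL_THRESHOLD_RFU:
        return "allele below analytical threshold"
    phr = peak_height_ratio(height_a, height_b)
    if phr < MIN_PEAK_HEIGHT_RATIO:
        return f"imbalanced pair (PHR {phr:.0%}): possible mixture or degradation"
    return f"balanced heterozygote (PHR {phr:.0%})"

print(assess_locus(820, 760))   # well-balanced pair
print(assess_locus(900, 310))   # imbalanced pair
print(assess_locus(400, 42))    # one allele below threshold
```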
Table 3: Essential Reagents and Materials for Source-Level DNA Analysis
| Reagent/Material | Primary Function | Application Notes |
|---|---|---|
| Phenolphthalein Reagent | Hemoglobin detection in presumptive blood tests | Catalyzes oxidation of phenolphthalein by peroxide; pink color indicates possible blood [4] |
| Acid Phosphatase Test Reagents | Semen identification through enzyme activity | Detects prostatic acid phosphatase; purple color development suggests semen presence [4] |
| Proteinase K | Protein digestion during DNA extraction | Critical for breaking down cellular structures and nucleases that could degrade DNA [4] |
| Silica-Based Membranes | DNA binding and purification | Selective binding in presence of chaotropic salts; efficient inhibitor removal [4] |
| STR Multiplex PCR Kits | Simultaneous amplification of multiple loci | Commercial kits typically amplify 20-24 loci plus gender marker; optimized buffer systems [4] |
| Internal Lane Standards | Fragment size calibration in electrophoresis | Fluorescently labeled size standards mixed with samples for precise allele calling [4] |
| Population Databases | Statistical calculation of match probabilities | Representative sample databases for appropriate frequency estimates [4] |
The logical framework for interpreting DNA evidence at the source level follows a systematic process that maintains clear distinction between propositions, findings, and case information [7]. This framework is essential for avoiding the transposition of the conditional, a common logical error where the probability of the proposition given the evidence is mistakenly equated with the probability of the evidence given the proposition [7].
The appropriate evaluation of forensic biology results requires careful consideration of all available information, including presumptive test results, body fluid identification, DNA concentrations, and case circumstances [4]. When propositions are formulated at the source level, the focus remains squarely on the biological origin of the material, utilizing the discriminating power of modern DNA profiling systems to address the fundamental question: "Whose DNA is this?" [2].
With modern DNA profiling techniques becoming increasingly sensitive, forensic genetics is undergoing a fundamental paradigm shift. The central question is evolving from "Whose DNA is this?" (a source-level question) to "How did it get there?" (an activity-level question) [2]. This shift is critical because merely being the source of a DNA trace is not punishable by law; legislation penalizes criminal activities [8]. Consequently, there is a growing need for forensic scientists to assist the judiciary in evaluating the strength of DNA evidence given different alleged activities, moving beyond simple profile rarity assessments [5].
The hierarchy of propositions distinguishes between different levels at which forensic evidence can be interpreted: sub-source level (concerned with the DNA donor), source level (concerned with the cellular origin of the DNA), activity level (concerned with how or when a trace was deposited), and offense level (concerned with whether a crime was committed and by whom) [8]. While forensic scientists can directly assist with activity-level propositions, offense-level judgments remain primarily within the domain of the judiciary [8].
Activity-level evaluation involves assessing the probability of the forensic findings given competing propositions about specific activities. This evaluation extends beyond the DNA profile itself to include cell type, quantity, location, and distribution of the recovered material [8]. The probative value lies in whether these observations are more likely under one proposed activity than another.
A key distinction in formulating propositions is whether they address the activity itself or the actor of the activity [8]. This distinction fundamentally affects the logical structure of the evaluation. The table below outlines this critical difference.
Table 1: Types of Activity-Level Propositions
| Proposition Type | Disputed Issue | Example Prosecution Proposition (Hp) | Example Defense Proposition (Hd) |
|---|---|---|---|
| Addressing the Activity | Whether a specific activity occurred | "The person of interest (POI) punched the victim." [2] | "The POI shook hands with the victim." [2] |
| Addressing the Actor | Who performed a conceded activity | "The POI is the person who climbed the balcony." [8] | "An unknown person is the person who climbed the balcony." [8] |
The following diagram illustrates the core logical process for evaluating DNA findings given activity-level propositions, incorporating the hierarchy of propositions and case context.
Moving from source to activity level interpretation requires a formal framework to assess the probability of the evidence given the competing propositions. The likelihood ratio (LR) is the fundamental metric for this evaluation [5]. A general form of the LR for activity-level propositions can be represented as follows:
LR = P(E | Hp, I) / P(E | Hd, I)
Where:
- E represents the forensic findings (DNA profile, quantity, location, etc.).
- Hp represents the prosecution's activity-level proposition.
- Hd represents the defense's activity-level proposition.
- I represents the relevant case background information.

To compute the probabilities in the LR, scientists must consider factors beyond DNA profile rarity. The required data, summarized in the table below, primarily inform the denominator P(E | Hd, I), which assesses the probability of the findings under the defense's proposition [2].
Table 2: Key Factors and Data Requirements for Activity-Level Evaluation
| Factor | Description | Relevant Experimental Data | Informs Which LR Component? |
|---|---|---|---|
| Transfer | The probability that an activity deposits a detectable amount of DNA in a specific location. | Controlled experiments mimicking alleged activities (e.g., grabbing, shaking hands) to measure deposited DNA [2]. | Primarily P(E \| Hp, I) |
| Persistence | The probability that DNA remains detectable over time and through environmental exposure. | Studies measuring DNA degradation on various surfaces under different conditions (e.g., temperature, humidity) [5]. | Both P(E \| Hp, I) and P(E \| Hd, I) |
| Background | The probability of finding DNA from an unrelated person (or the POI) on the examined surface. | Prevalence studies measuring DNA levels on surfaces from the general environment (e.g., clothing, skin, objects in public spaces) [2] [5]. | Primarily P(E \| Hd, I) |
| Recovery | The efficiency of the collection and analysis methods in detecting the DNA. | Validation studies of swabbing techniques, extraction kits, and amplification systems for low-level DNA [8]. | Both P(E \| Hp, I) and P(E \| Hd, I) |
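To make the role of these factors concrete, the sketch below combines transfer, persistence, recovery, and background into a simplified activity-level LR for a hypothetical scenario ("POI grabbed the item" vs. "POI never touched it"). Every probability is an illustrative placeholder standing in for the experimental TPPR data described in the table; it is a toy calculation, not a casework method.

```python
# Sketch: a simplified activity-level LR. Under Hp the POI's DNA arrives
# via direct transfer (then must persist and be recovered); under Hd it
# can only be explained by background prevalence. All values are
# illustrative placeholders for real experimental data.

t_direct = 0.80    # direct transfer deposits detectable DNA
persist  = 0.60    # deposited DNA survives until sampling
recover  = 0.70    # sampling and typing detect the surviving DNA
background = 0.05  # POI-type DNA present on such items as background

# Finding E = "POI's DNA detected on the item".
p_route = t_direct * persist * recover            # the Hp-specific route
p_e_given_hp = p_route + (1 - p_route) * background
p_e_given_hd = background                         # background is the only route

lr = p_e_given_hp / p_e_given_hd
print(f"Pr(E|Hp) = {p_e_given_hp:.3f}, Pr(E|Hd) = {p_e_given_hd:.3f}, LR = {lr:.1f}")
```

Note how the background term dominates the denominator: halving the assumed background prevalence would roughly double the LR, which is why prevalence studies are so important to this kind of evaluation.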
The following workflow provides a detailed methodology for implementing activity-level evaluation in casework, from initial case receipt to final reporting.
Successfully implementing activity-level evaluation requires both conceptual tools and physical reagents to generate the necessary data. The following table details key components of this toolkit.
Table 3: Essential Research Reagent Solutions and Materials for Activity-Level Studies
| Item Name | Function/Application | Specifications & Considerations |
|---|---|---|
| Mock Casework Samples | Simulate DNA transfer and persistence under controlled conditions. | Use cotton, polyester, or glass substrates. Standardize contact pressure/duration. Donors of known shedder status are critical [2]. |
| Surface Sampling Kits | Recover DNA from various surfaces for background studies. | Include multiple swab types (e.g., cotton, nylon flocked) and moistening agents to optimize recovery from different materials [8]. |
| DNA Quantification Kits | Measure the total human DNA yield from samples. | qPCR-based kits are essential for determining low-level DNA quantities, a key variable in transfer scenarios [8]. |
| Probabilistic Genotyping Software | Interpret complex, low-level, or mixed DNA profiles. | Software based on validated probabilistic models is crucial for accurately determining profile quality and weight for LRs [9] [5]. |
| Bayesian Network Software | Model complex case scenarios and interdependencies between variables. | Graphical software allows for transparent construction of models integrating transfer, persistence, and background probabilities [9]. |
| Population DNA Databases | Inform estimates of background DNA presence and profile rarity. | Representative, high-quality databases are needed to assess the random match probability and the chance of adventitious match [2]. |
The adoption of activity-level propositions represents a necessary evolution in forensic genetics, aligning scientific practice with the real questions faced by courts. While challenges exist—such as the need for robust data on transfer and persistence, and ongoing training for scientists and legal professionals—the framework for implementation is established and feasible [9] [2]. The continued development of community-wide knowledge bases, standardized experimental protocols, and transparent reporting practices will further solidify the foundation for this critical discipline. By embracing this approach, forensic scientists can provide more focused, relevant, and balanced expert information, ultimately contributing to a more effective criminal justice process.
The interpretation of forensic evidence, particularly DNA, is undergoing a fundamental paradigm shift. For decades, the primary question addressed by forensic science has been one of source-level propositions—essentially, "Does this DNA profile originate from this specific individual?" However, advancements in analytical sensitivity, capable of generating profiles from minute, non-visible trace material, have rendered the issue of source increasingly less contentious and often forensically irrelevant [2]. The mere presence of a person's DNA on an item is no longer synonymous with that person having undertaken a specific criminal activity.
This evolution has created a pressing need for the justice system to address activity-level propositions, which concern "how" and "when" a trace was deposited [10]. This shift moves the focus from "whose DNA is this?" to "how did it get there?" [2]. Activity-level reporting provides a structured, objective framework to evaluate findings in the context of alleged activities, thereby offering more meaningful and helpful assistance to triers-of-fact in judicial proceedings [10]. This application note details the methodologies, protocols, and analytical frameworks essential for implementing robust activity-level evaluations.
Forensic evidence evaluation operates within a hierarchy of propositions, which frames the specific questions being addressed.
Source Level Propositions: These propositions concern the source of the recovered trace material. The competing propositions are typically of the form: "The trace originated from the person of interest (POI)" versus "The trace originated from an unknown individual" [2]. The evaluation relies heavily on assessing the rarity of the DNA profile in a relevant population.
Activity Level Propositions: These propositions address the activities that led to the deposition of the trace. Competing propositions might be: "The POI punched the victim" versus "The POI shook hands with the victim" [2]. The evaluation must consider a wider set of factors, including transfer, persistence, prevalence, and recovery (TPPR) of DNA, alongside the results of technical analysis.
Activity-level evaluation is a Bayesian approach that compares the probability of the forensic findings under two competing propositions: the prosecution's proposition (Hp) and the defense's proposition (Hd). The outcome is a Likelihood Ratio (LR), which quantifies the strength of the evidence for one proposition over the other [11].
The LR is expressed as:

LR = Pr(E | Hp, I) / Pr(E | Hd, I)

Where:

- E = the forensic findings (e.g., the DNA profile, its quality, and quantity).
- Hp = the prosecution's activity-level proposition.
- Hd = the defense's activity-level proposition.
- I = the framework of circumstances of the case.

An LR greater than 1 supports the prosecution's proposition, while an LR less than 1 supports the defense's proposition. The magnitude indicates the strength of that support [6].
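In reporting, the magnitude of the LR is often mapped onto a verbal scale. The sketch below implements one such mapping on orders of magnitude of the LR; the band labels and cut-offs are an assumption for illustration, since exact wording and thresholds differ between published guidelines.

```python
import math

# Sketch: mapping an LR onto a verbal support scale by order of magnitude.
# Band labels and cut-offs are illustrative; guidelines differ in detail.

BANDS = [
    (1, "weak"),
    (2, "moderate"),
    (3, "moderately strong"),
    (4, "strong"),
    (6, "very strong"),
    (float("inf"), "extremely strong"),
]

def verbal_support(lr: float) -> str:
    if lr == 1:
        return "findings do not help discriminate between the propositions"
    side = "Hp" if lr > 1 else "Hd"          # LR < 1 supports the defense
    magnitude = abs(math.log10(lr))          # symmetric treatment of both sides
    for upper, label in BANDS:
        if magnitude <= upper:
            return f"{label} support for {side}"

print(verbal_support(50))      # moderate support for Hp
print(verbal_support(0.002))   # moderately strong support for Hd
print(verbal_support(1e7))     # extremely strong support for Hp
```

Treating LR = 0.002 as support for Hd of the same strength as LR = 500 for Hp keeps the scale symmetric, which is one reason log10(LR) is the usual working quantity.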
Robust activity-level evaluation requires empirical data on the mechanisms of DNA transfer and persistence. Below is a detailed protocol for a transfer study.
Objective: To generate data on the quantity and quality of DNA transferred through direct contact (e.g., grabbing) versus indirect transfer (e.g., handshake).
1. Reagent and Material Setup:
2. Experimental Procedure:
   1. Pre-cleaning: Wipe all contact items with a DNA-decontaminating solution and UV-irradiate to eliminate background DNA.
   2. Baseline Swab: Swab the palms and fingers of the participant to assess background DNA prior to the experiment.
   3. Direct Transfer Simulation:
      - Participant A dons a clean cotton glove for 30 minutes to accumulate DNA.
      - Participant A directly handles the target item (e.g., bottle) for a defined time (e.g., 10 seconds).
      - Using a moistened swab, collect DNA from the contacted area of the item.
   4. Indirect Transfer Simulation (Handshake):
      - Participant A dons a clean cotton glove for 30 minutes.
      - Participant B dons a separate clean cotton glove for 30 minutes.
      - Participants A and B shake hands for 5 seconds.
      - Immediately after the handshake, Participant B handles a second, pristine target item for 10 seconds.
      - Collect DNA from this second item.
   5. Control Samples: Collect negative control swabs from unused, pre-cleaned items and positive controls from participant buccal swabs.
   6. Sample Processing: Extract DNA from all swabs and controls. Quantify the total human DNA and the male DNA (if applicable) using qPCR. Subject extracts to STR PCR amplification and capillary electrophoresis.
3. Data Analysis:
Estimate the transfer probability, t: the probability of transferring DNA of a given quantity and quality through direct versus indirect activities. These probabilities can later inform the Pr(E | Hp, I) and Pr(E | Hd, I) terms in an LR calculation for a real case.

Objective: To characterize the amount and composition of background DNA on commonly encountered items.
Procedure: Sample a range of items from different environments (e.g., homes, cars, offices) from volunteers who are the primary users of these items. Swab predefined areas and process the samples as in Protocol 3.1. This data provides crucial information on the probability of finding DNA from non-suspects, including co-habitants, on items, which is vital for formulating realistic alternative propositions (Hd) [6].
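Counts from replicate experiments like those above can be turned into probability estimates with uncertainty bounds. The sketch below computes a 95% Wilson score interval for a binomial proportion; the replicate counts are hypothetical, and real estimates would come from a laboratory's own validated studies.

```python
import math

# Sketch: estimating a transfer probability t from replicate counts, with
# a 95% Wilson score confidence interval. Counts below are hypothetical.

def wilson_interval(successes: int, trials: int, z: float = 1.96):
    """Wilson score interval for a binomial proportion (95% by default)."""
    p_hat = successes / trials
    denom = 1 + z**2 / trials
    centre = (p_hat + z**2 / (2 * trials)) / denom
    half = (z / denom) * math.sqrt(
        p_hat * (1 - p_hat) / trials + z**2 / (4 * trials**2)
    )
    return centre - half, centre + half

# Hypothetical results: detectable DNA in 34 of 40 direct-contact
# replicates versus 6 of 40 handshake (indirect) replicates.
for label, k, n in [("direct", 34, 40), ("indirect", 6, 40)]:
    lo, hi = wilson_interval(k, n)
    print(f"{label}: t = {k/n:.2f} (95% CI {lo:.2f}-{hi:.2f})")
```

The Wilson interval is chosen here because, unlike the naive normal approximation, it behaves sensibly for the small replicate numbers and extreme proportions typical of transfer studies.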
Consider a case where a petrol bottle was used in an arson attack. The defendant (POI) was present in the area but claims innocence. The DNA result from the bottle neck is a mixed profile, with the major component matching the POI.
- Hp: The POI filled the bottle with petrol and placed it in the ceiling.
- Hd: The POI was present in the toilet as an innocent person [6].

A Bayesian Network (BN) is a graphical model that represents the probabilistic relationships between variables. It is an ideal tool for handling the complex dependencies in activity-level evaluations. The following diagram models the key relationships for the case above.
Bayesian Network for DNA Transfer: This model visually represents how the activity (informed by Hp or Hd) affects the probability of DNA transfer from the Person of Interest (POI) and an unknown individual. The final DNA result depends on these transfer events, as well as the presence of background DNA and the potential for laboratory contamination.
Using the BN, probabilities are assigned to each variable based on case information and experimental data (e.g., from Protocols 3.1 and 3.2).
- Pr(E | Hp, I): Under Hp, the probability of finding the POI's DNA on the bottle is high (they handled it). The probability of also finding an unknown's DNA is informed by background prevalence data.
- Pr(E | Hd, I): Under Hd, the POI is an innocent bystander. The probability of finding their DNA on the bottle must be explained by indirect transfer or background. The probability of finding an unknown's DNA is high (they handled it).

The LR is the ratio of these two probabilities. In a case like R v QUIST, a properly structured evaluation that accounts for common sources of unknown DNA can yield an LR that provides strong, but not misleading, support for the prosecution's proposition [6].
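The same calculation can be sketched by hand, marginalizing over the routes by which the POI's DNA could have reached the bottle neck. All probabilities below are toy values, and the routes are assumed independent for simplicity; a real evaluation would derive them from TPPR data and compute them in dedicated BN software.

```python
# Sketch: a hand-rolled version of the BN calculation for the petrol-bottle
# case. Toy probabilities; routes assumed independent for simplicity.

def prob_poi_dna(direct: float, indirect: float,
                 background: float, contamination: float) -> float:
    """Probability of detecting POI DNA given the possible routes:
    1 minus the probability that every route fails."""
    fail = (1 - direct) * (1 - indirect) * (1 - background) * (1 - contamination)
    return 1 - fail

# Hp: the POI filled and handled the bottle (direct transfer dominates).
p_e_hp = prob_poi_dna(direct=0.85, indirect=0.0,
                      background=0.02, contamination=0.001)

# Hd: the POI never touched the bottle; their DNA could only arrive
# indirectly, as background, or via laboratory contamination.
p_e_hd = prob_poi_dna(direct=0.0, indirect=0.05,
                      background=0.02, contamination=0.001)

lr = p_e_hp / p_e_hd
print(f"Pr(E|Hp) = {p_e_hp:.3f}, Pr(E|Hd) = {p_e_hd:.3f}, LR = {lr:.1f}")
```

Even with a high direct-transfer probability under Hp, the LR here stays modest because indirect transfer and background give the defense proposition a plausible explanation for the finding, which is precisely the kind of nuance a BN makes explicit.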
Successful activity-level research and implementation depend on specific reagents and methodologies to ensure data robustness and reproducibility.
Table 1: Key Research Reagent Solutions for Activity-Level Studies
| Item/Category | Function in Research | Application Example |
|---|---|---|
| DNA/RNA-free Consumables (swabs, tubes, water) | To prevent contamination of experimental samples with extraneous DNA, ensuring the integrity of results. | Used in all controlled transfer studies for sample collection and processing [6]. |
| Standardized Substrates (e.g., specific plastic, glass, fabric) | To provide a consistent and forensically relevant surface for DNA transfer and persistence studies. | Used to simulate contact with items like bottle necks, weapons, or clothing [6]. |
| Quantitative PCR (qPCR) Kits | To accurately measure the total quantity of human DNA, and often male DNA, in a sample. | Essential for generating data on the amount of DNA transferred, a key variable in evaluating activity-level propositions. |
| STR Multiplex PCR Kits | To generate DNA profiles that identify the contributors to a sample, distinguishing between major and minor components. | Used to determine the quality and composition of the transferred DNA (single-source vs. mixture) [6]. |
| Bayesian Network Software (e.g., GeNIe, Hugin) | To construct, populate, and compute complex probabilistic models for evaluating evidence under multiple propositions. | Used to implement the logical framework for calculating Likelihood Ratios in real casework [6]. |
Despite its scientific rigor, the global adoption of activity-level reporting faces significant barriers. A primary challenge is the legal problem of proof in U.S. courts [11] [12]. The evaluative framework requires a defense proposition (Hd). If the defense does not provide one pre-trial, the scientist may need to formulate a "reasonable" Hd. However, U.S. evidence rules (Federal Rules 104(b) and 702) require an adequate foundation for any fact assumed by an expert. A proxy Hd formulated by the prosecution's expert may be deemed speculative and fail the tests of relevance and scientific knowledge, leading to its exclusion [11] [12].
Other barriers include a lack of robust and impartial data for all relevant scenarios, regional differences in methodology, and variable availability of training [10].
The shift from source-level to activity-level propositions is a necessary evolution for forensic science. It addresses the questions actually being asked in modern courtrooms, where the presence of DNA is often a given, but its meaning is anything but. By employing a structured Bayesian framework, supported by empirical data from controlled experiments and implemented through tools like Bayesian networks, forensic scientists can provide courts with transparent, robust, and meaningful evaluations of evidence. While significant legal and practical challenges to global adoption remain, the continued development and validation of these methodologies are critical to improving the credibility and utility of forensic science internationally.
In forensic science, particularly the evaluation of biological evidence such as DNA, the hierarchy of propositions provides a structured framework for formulating and assessing competing hypotheses during casework. This framework distinguishes between different levels of case information, ranging from the general (source level) to the specific (activity level). Source-level propositions focus on the origin of a biological trace, typically addressing questions like "Is the bloodstain from Mr. A?" or "Is the DNA profile from Mr. A?" [2]. In contrast, activity-level propositions address the specific actions or mechanisms that led to the deposition of the trace, such as "Did Mr. A punch the victim?" versus "Did Mr. A merely shake hands with the victim?" [2]. The distinction is critical: while source-level propositions are often a prerequisite for activity-level considerations, they do not directly address the question of how a particular piece of evidence came to be present at a crime scene. With the advent of highly sensitive DNA profiling technologies capable of generating results from tiny, non-visible stains, the focus is shifting from the question of "whose DNA is this?" to "how did it get there?" [2]. This shift necessitates a deeper understanding of transfer mechanisms and activities to correctly interpret the probative value of forensic findings.
Source identity, or source-level propositions, concerns the origin of a biological trace. It seeks to identify the individual from whom a specific piece of biological material originated [2]. This level can be further subdivided into source and sub-source levels [13]. The source level specifies the particular biological material from which a DNA profile was obtained, such as blood or saliva [4] [13]. The sub-source level, a product of modern, highly sensitive analytical techniques, refers more generally to the source of a DNA profile, even when no specific biological fluid can be identified [13]. For example, a sub-source level proposition would be: "The DNA on the clothing came from the defendant" versus "The DNA on the clothing came from someone else" [13]. Evaluating evidence given source-level propositions commonly involves assessing the rarity of the DNA profile in a relevant population and is often perceived as more straightforward because it relies on well-established population databases and statistical models [2].
Activity-level propositions move beyond the question of source to investigate the actions and mechanisms associated with the evidence. This involves evaluating how a trace was transferred and persisted on a surface or item, requiring consideration of complex factors beyond mere source identity [2]. Key concepts include transfer (the mechanism, direct or indirect, by which DNA reaches an item), persistence (how long deposited DNA remains detectable), and background prevalence (DNA present on the item irrespective of the alleged activity) [2].
Evaluating evidence given activity-level propositions is inherently more complex. It requires a logical framework, such as a Bayesian network or Chain Event Graph (CEG), to combine probabilities related to transfer, persistence, and background levels for activities such as punching, grabbing, or sitting on a car seat [2] [14]. For instance, in a drug trafficking case involving banknotes with drug traces, a CEG can model various storylines of how the notes became contaminated to evaluate the support for competing activity-level propositions offered by the prosecution and defense [14].
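The combination of transfer, persistence, and background probabilities described above can be sketched numerically. The following minimal example uses entirely hypothetical probability values for the punching-versus-handshake scenario; a real evaluation would draw these numbers from experimental TPPR data and typically use a full Bayesian network rather than this two-term shortcut.

```python
# Illustrative sketch: combining transfer, persistence, and background
# probabilities into an activity-level likelihood ratio.
# All probability values below are hypothetical, for demonstration only.

def prob_dna_found(p_transfer, p_persist, p_background):
    """P(suspect's DNA detected) = P(deposited by the activity and persisted)
    OR P(already present as background), combined via inclusion-exclusion."""
    p_activity = p_transfer * p_persist
    return p_activity + p_background - p_activity * p_background

# Hp: "Mr. A punched the victim"; Hd: "Mr. A only shook hands with the victim"
p_e_given_hp = prob_dna_found(p_transfer=0.80, p_persist=0.60, p_background=0.01)
p_e_given_hd = prob_dna_found(p_transfer=0.05, p_persist=0.60, p_background=0.01)

lr = p_e_given_hp / p_e_given_hd
print(f"P(E|Hp) = {p_e_given_hp:.3f}, P(E|Hd) = {p_e_given_hd:.3f}, LR = {lr:.1f}")
```

With these invented inputs the finding is roughly twelve times more probable under the punching proposition, illustrating how the same DNA profile can carry very different weight depending on the assessed transfer route.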
The table below summarizes the core differences in focus between source identity and transfer mechanisms/activities.
Table 1: Key Differences Between Source-Level and Activity-Level Propositions
| Aspect | Source Identity (Source-Level) | Transfer Mechanisms & Activities (Activity-Level) |
|---|---|---|
| Core Question | "Whose DNA is this?" or "Who is the source of this biological material?" [2] | "How did the DNA get there?" or "What activity caused the transfer?" [2] |
| Focus of Evaluation | Identity of the individual who is the source of the trace [2]. | The nature of the activity and the mechanisms of transfer, persistence, and background prevalence [2]. |
| Typical Propositions | "The bloodstain came from Mr. A." vs. "The bloodstain came from an unknown person." [2] | "Mr. A punched the victim." vs. "Mr. A shook hands with the victim." [2] |
| Key Input Data | Rarity of the DNA profile in a population [2]. | Probabilities of transfer, persistence, and background levels of DNA, informed by case circumstances and experimental data [2]. |
| Complexity & Tools | Relatively straightforward; relies on population statistics and profile rarity calculations [2]. | High complexity; often requires Bayesian Networks or Chain Event Graphs to model scenarios and compute likelihood ratios [2] [14]. |
| Level of Contestation | Often not disputed in court with reliable DNA profiling [2]. | Frequently the central disputed issue in a case [2]. |
This protocol outlines the standard methodology for evaluating DNA evidence given source-level propositions.
1. Sample Collection and DNA Extraction:
2. DNA Quantification and Amplification:
3. Genetic Profiling and Analysis:
4. Statistical Evaluation and Reporting:
This protocol describes how to design experiments to generate data for evaluating activity-level propositions, using a DNA transfer scenario as an exemplar.
1. Define the Activity and Variables:
2. Experimental Setup and Simulation:
3. Sample Collection and DNA Profiling:
4. Data Analysis and Probability Assignment:
The following diagram, generated using Graphviz, illustrates a simplified Bayesian network for evaluating DNA evidence given activity-level propositions. This model incorporates the key concepts of activity, transfer, persistence, and background DNA.
Title: Bayesian Network for Activity Evaluation
Chain Event Graphs (CEGs) are particularly useful for modeling asymmetric, time-ordered activity scenarios. The diagram below, generated using Graphviz, represents a simplified CEG for a drug trace on banknotes case, showing different storylines proposed by prosecution and defense.
Title: CEG for Drug Trace Scenarios
The following table details key reagents, software, and materials essential for conducting research and evaluation across both source and activity-level propositions.
Table 2: Essential Research Reagents and Materials for Forensic Evidence Evaluation
| Item Name | Type | Primary Function |
|---|---|---|
| STR Multiplex PCR Kits | Reagent | Simultaneously amplifies multiple Short Tandem Repeat (STR) loci from a DNA sample to generate a genetic profile for source identification [4]. |
| qPCR Quantification Kits | Reagent | Determines the quantity and quality of human DNA in an extract and checks for the presence of PCR inhibitors, which is crucial for reliable profiling [4]. |
| Presumptive Test Reagents | Reagent | Provides a preliminary, though not definitive, indication of the presence of a specific body fluid (e.g., blood, saliva) to inform source-level propositions [4]. |
| Bayesian Network Software | Software | A graphical probabilistic framework (e.g., Hugin, Netica) used to build complex models for evaluating evidence given activity-level propositions, accounting for transfer, persistence, and background [2]. |
| STRmix | Software | A probabilistic genotyping software used to interpret complex DNA profiles, including mixtures, which is a fundamental tool for computing LRs at the sub-source level [4]. |
| Chain Event Graph (CEG) Framework | Analytical Framework | A graphical model for representing asymmetric, time-ordered event sequences, ideal for comparing competing activity-level narratives proposed in a criminal case [14]. |
The Likelihood Ratio (LR) is a powerful statistical tool derived from Bayes' theorem that quantifies how much a specific test result changes the odds of a hypothesis, such as the presence of a disease or the activity of a compound against a drug target [15]. Building on the framework first conceptualized in the 18th century by Reverend Thomas Bayes, the LR has become a cornerstone for interpreting diagnostic test results in both clinical medicine and pharmaceutical research [15]. The LR provides a unified logical framework for evaluating evidence, making it particularly valuable for assessing source-level versus activity-level propositions in drug development.
In the context of pharmaceutical research, the LR enables researchers to move from qualitative assessments to quantitative evidential interpretation. For activity level propositions, it answers the crucial question: "How many times more likely is this experimental result to be observed if the drug compound is actively engaging the intended biological target versus if it is not?" This framework is especially critical in early-stage drug discovery where researchers must prioritize lead compounds from thousands of potential candidates [16].
The Likelihood Ratio operates within a Bayesian framework, providing a mathematical structure for updating beliefs based on new evidence. The fundamental equation for the LR when a test result equals a specific value r is expressed as:
LR(r) = P(x = r | D+) / P(x = r | D–) [15]
Where x denotes the test result, r is the specific value observed, D+ indicates that the condition of interest is present (e.g., the compound is active), and D– indicates that it is absent [15].
This formulation allows researchers to quantify the strength of evidence provided by experimental results, bridging the gap between statistical analysis and practical decision-making in pharmaceutical development.
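As a concrete illustration of the equation above, the short sketch below estimates LR(r) from binned assay readouts for hypothetical active (D+) and inactive (D–) compound sets; the counts are invented for demonstration, and real applications would use smoothed density estimates rather than raw bin counts.

```python
# Minimal sketch of LR(r) = P(x = r | D+) / P(x = r | D-) estimated from
# binned assay readouts. All counts below are hypothetical.
from collections import Counter

active = [8, 9, 9, 10, 10, 10, 11, 11, 12, 12]    # readouts from D+ compounds
inactive = [3, 4, 4, 5, 5, 5, 6, 6, 7, 10]        # readouts from D- compounds

def lr_at(r, pos, neg):
    """Ratio of relative frequencies of readout r in the two groups."""
    c_pos, c_neg = Counter(pos)[r], Counter(neg)[r]
    if c_neg == 0:
        return float('inf')   # r never seen among inactives
    # cross-multiplied to keep the ratio exact for integer counts
    return (c_pos * len(neg)) / (c_neg * len(pos))

print(lr_at(10, active, inactive))  # 3/10 vs 1/10 -> LR = 3.0
```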
The application of LRs extends beyond simple binary outcomes to accommodate various data structures encountered in drug discovery research, including LR(+) for positive results, LR(–) for negative results, and LR(Δ) for ranges of values.
Each variation provides a method for evidence weighting appropriate to different experimental contexts, from high-throughput screening to detailed mechanistic studies.
Table 1: Likelihood Ratio Types and Their Applications in Drug Development
| LR Type | Definition | Graphical Representation | Drug Development Application |
|---|---|---|---|
| LR(r) for specific test value | Probability of observing value r in active vs. inactive compounds | Slope of tangent to ROC curve at point corresponding to r | High-resolution compound potency assessment |
| LR(+) for positive results | Probability of positive test in active vs. inactive compounds | Slope of line segment from origin to ROC point | Primary screening hit identification |
| LR(–) for negative results | Probability of negative test in active vs. inactive compounds | Slope of line segment from ROC point to upper-right corner | Exclusion of inactive compounds |
| LR(Δ) for value ranges | Probability of values within range in active vs. inactive compounds | Slope of line segment between two points on ROC curve | Potency range categorization for lead optimization |
Table 2: Likelihood Ratio Interpretation Guide for Activity Level Assessment
| LR Value | Strength of Evidence | Impact on Activity Probability | Development Decision |
|---|---|---|---|
| >10 | Strong evidence for activity | Large increase | Progress to lead optimization |
| 5-10 | Moderate evidence for activity | Moderate increase | Further mechanistic studies |
| 2-5 | Weak evidence for activity | Slight increase | Secondary screening |
| 1-2 | Minimal evidence | Almost no change | Consider discarding |
| <1 | Evidence against activity | Decreases probability | Discard compound |
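The interpretive bands in Table 2 connect to posterior probabilities through the odds form of Bayes' theorem (posterior odds = prior odds × LR). The sketch below, using an arbitrary screening prior of 0.10, shows why an LR above 10 produces a large shift while an LR near 1 barely moves the probability.

```python
# Hedged sketch: updating a prior probability of activity with an LR,
# via the odds form of Bayes' theorem. The prior of 0.10 is arbitrary.

def update(prior_prob, lr):
    """Return the posterior probability after applying an LR in odds form."""
    prior_odds = prior_prob / (1 - prior_prob)
    post_odds = prior_odds * lr
    return post_odds / (1 + post_odds)

# LRs drawn from the bands of Table 2:
for lr in (0.5, 1, 2, 10):
    print(f"LR = {lr:>4}: P(active) goes 0.10 -> {update(0.10, lr):.3f}")
```

An LR of 10 lifts the probability from 0.10 to roughly 0.53, whereas an LR of 2 only reaches about 0.18, which is the quantitative rationale behind the development decisions in the table.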
Purpose: To implement LR framework for prioritizing hits from high-throughput screening (HTS) campaigns in early drug discovery.
Materials:
Methodology:
Validation: Confirm activity of prioritized hits using orthogonal assay methods with established LR parameters.
Purpose: To integrate machine learning with LR framework for improved prediction of compound activity in structure-based drug design.
Materials:
Methodology:
Model Training:
Virtual Screening with LR Integration:
Experimental Validation:
Applications: This protocol was successfully implemented in identifying natural inhibitors against human αβIII tubulin isotype, narrowing 89,399 compounds to 20 active candidates with exceptional ADME-T properties [16].
Diagram 1: LR Assessment Workflow for Compound Screening. This workflow illustrates the systematic process from data collection to development decisions using the LR framework.
Diagram 2: Bayesian Framework for Activity Assessment. This diagram shows the relationship between prior probability, experimental evidence, and posterior probability through the LR framework.
Table 3: Essential Research Reagents for LR-Based Compound Evaluation
| Reagent/Resource | Function in LR Framework | Application Context |
|---|---|---|
| ZINC Compound Database | Source of natural compounds for screening libraries | Virtual screening and initial activity assessment [16] |
| PaDEL-Descriptor Software | Generates molecular descriptors for machine learning | Converting chemical structures to numerical data for ML-based LR [16] |
| AutoDock Vina | Structure-based virtual screening platform | Initial compound prioritization based on binding energy [16] |
| DUD-E Server | Generates decoy compounds with similar properties | Creating negative controls for ML training datasets [16] |
| Modeller | Homology modeling of protein structures | Creating 3D models when experimental structures unavailable [16] |
| Directory of Useful Decoys (DUD-E) | Generates decoy molecules for validation | Creating reliable negative datasets for machine learning [16] |
The integration of artificial intelligence (AI) in drug discovery has created new opportunities for implementing the LR framework at scale. AI algorithms can process vast chemical spaces and optimize clinical trials, enhancing the precision of LRs for activity level propositions [17]. Machine learning approaches, particularly supervised learning based on chemical descriptor properties, enable differentiation between active and inactive molecules, providing robust probability estimates for LR calculations [16].
In recent applications, AI-driven screening strategies have successfully identified novel anticancer drugs by combining large databases with manually curated information to describe therapeutic patterns between compounds and diseases [18]. These approaches leverage the LR framework to prioritize compounds for further investigation, significantly accelerating the drug discovery process while reducing costs [17].
The transformative potential of AI in drug discovery is exemplified by platforms like AlphaFold for protein structure prediction and AtomNet for structure-based drug design, which provide the structural insights necessary for accurate LR calculations in target validation and compound optimization [17]. These technologies enable researchers to move beyond simple activity assessment to nuanced evaluation of activity level propositions based on comprehensive structural and biochemical data.
The forensic science community increasingly addresses questions not just about the source of DNA, but about the activities that led to its deposition. This shift necessitates a deep understanding of the mechanisms of Transfer, Persistence, Prevalence, and Recovery (TPPR). TPPR provides the critical framework for evaluating forensic findings given activity-level propositions, which ask how and when DNA was deposited on a surface or item [10] [19]. While source-level propositions seek to identify the donor of a DNA sample, activity-level propositions aim to reconstruct the events that caused the transfer, making TPPR data indispensable for the interpretation of complex evidence scenarios [20] [10].
Advancements in DNA profiling technologies, capable of generating profiles from minuscule quantities of biological material, have heightened the importance of TPPR. Modern techniques can produce full profiles from what is termed 'touch DNA' or from a few cells [20] [19]. However, this high sensitivity also introduces complexity, as DNA can be transferred through various direct and indirect routes, and its presence on a surface is influenced by a multitude of factors. Consequently, the forensic scientist must be equipped to assess the likelihood of finding a DNA profile given different activity scenarios, for which a robust understanding of TPPR is foundational [19].
The TPPR framework breaks down the life cycle of DNA evidence from deposition to collection and analysis. Each component addresses a specific stage in this process, and together they provide a structured approach for evaluating evidence given activity-level propositions.
The diagram below illustrates the interconnected stages of the TPPR framework and the key factors influencing each stage.
Empirical data is essential for applying the TPPR framework. The following tables summarize key quantitative findings from research, which can be used to inform probabilities in evaluative reporting.
Table 1: DNA Recovery from Body Areas Following Mock Assault (Skin-to-Skin Contact)
| Body Area Sampled | Sampling Method | Key Finding | Reference |
|---|---|---|---|
| Forearms | Double-swabbing (wet then dry) | Recovered ~13.7% more offender DNA than other methods | [20] |
| Forearms | Single swab (various movements) | Less effective than double-swabbing for offender DNA recovery | [20] |
| Forearms | Tape Lifting | Less effective than double-swabbing for offender DNA recovery | [20] |
Table 2: Factors Influencing DNA Transfer and Persistence
| Factor | Influence on DNA-TPPR | Research Need / Note |
|---|---|---|
| Shedder Status | Inter-individual variation significantly impacts the amount of DNA deposited. | Need to understand underlying genetic and non-genetic properties. |
| Background DNA Prevalence | Non-self DNA is commonly found on hands, personal items, and clothing. | Need for more data on prevalence on bodies of children and adults, and in shared spaces. |
| Substrate Properties | Surface topography, chemical composition, and fiber type affect transfer and persistence. | More research needed on a wider array of forensically relevant substrates. |
| Handwashing | Reduces DNA quantity on hands, but DNA re-accumulates post-washing. | Impact of different washing methods and personal habits requires further study. |
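To illustrate how persistence findings like those in Table 2 might be parameterized for evaluative use, the sketch below applies a simple exponential-decay model. Both the functional form and the 24-hour half-life are assumptions chosen for demonstration only; real persistence curves must be estimated from controlled TPPR experiments on the relevant substrate.

```python
# Illustrative only: exponential decay is one simple way to express DNA
# persistence over time. The 24-hour half-life below is hypothetical and
# is NOT an empirical value; it stands in for experimentally derived data.

def p_persist(hours, half_life_hours):
    """Probability that deposited DNA is still recoverable after `hours`."""
    return 0.5 ** (hours / half_life_hours)

for t in (0, 12, 24, 48):
    print(f"t = {t:>2} h: P(persist) = {p_persist(t, 24.0):.2f}")
```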
Robust experimental protocols are required to generate high-quality TPPR data. The following section details a core methodology for studying DNA transfer and recovery from skin surfaces.
This protocol is adapted from studies investigating the recovery of foreign DNA from a victim's skin following a mock assault scenario [20].
1.0 Objective: To evaluate and compare the efficiency of different sampling methods for recovering non-self DNA deposited on human skin via direct contact.
2.0 Experimental Design:
3.0 Materials:
4.0 Step-by-Step Procedure:
5.0 Data Analysis:
The experimental workflow for the DNA recovery protocol is outlined below.
Table 3: Essential Materials for DNA TPPR Experiments from Skin
| Item | Function / Application | Example Product(s) |
|---|---|---|
| Cotton Swabs | The most common tool for sample collection from skin and surfaces. Can be used dry or wet. | Puritan Cap-Shure Sterile Cotton Swabs [20] |
| Nylon Flocked Swabs | Swabs with a perpendicular nylon fiber tip designed to release more collected biological material during extraction. | Copan FLOQSwabs [20] |
| Tape Lifts | An alternative collection method using adhesive to lift cellular material from a surface. | SceneSafe FAST Minitape [20] |
| Distilled Water | Used to moisten swabs to increase the efficiency of cell collection from dry surfaces. | N/A |
Integrating TPPR data allows forensic scientists to build Bayesian networks for evaluating evidence given activity-level propositions. This structured approach forces the consideration of alternative scenarios and uses TPPR data to assign probabilities to the findings under each scenario [21]. For instance, a Bayesian network can model the probability of detecting a suspect's DNA on a victim's neck given the proposition of a strangle attack versus the proposition of an innocent social interaction. The probabilities within the network would be informed by TPPR data on transfer during gripping, persistence over time, and background prevalence of non-self DNA on skin [20] [21].
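A minimal numeric version of the strangle-versus-social-contact network described above can be computed by enumeration. All numbers below are hypothetical placeholders for probabilities that TPPR data on gripping transfer, persistence on skin, and background prevalence would supply in practice; note also that priors over propositions are for the court, so this toy posterior is purely didactic.

```python
# Hedged sketch: a two-node Bayesian network (Activity -> DNA detected),
# evaluated by brute-force enumeration. All numbers are hypothetical.
p_activity = {"strangle": 0.5, "social": 0.5}         # illustrative equal priors
p_detect_given = {"strangle": 0.70, "social": 0.08}   # P(DNA found | activity)

# Posterior P(activity | DNA found) by Bayes' rule:
joint = {a: p_activity[a] * p_detect_given[a] for a in p_activity}
total = sum(joint.values())
posterior = {a: joint[a] / total for a in joint}
print(posterior)  # strangle ~0.897, social ~0.103
```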
Despite its importance, the global adoption of evaluative reporting using TPPR is hampered by barriers such as a lack of robust and impartial data for all relevant scenarios, regional differences in methodology, and variable availability of training [10]. Continued research to fill data gaps, particularly on the persistence of deposits over long periods and the prevalence of background DNA in a wider range of scenarios, is critical to strengthening this scientific framework [19].
Bayesian Networks (BNs) provide a powerful framework for representing and reasoning about complex activity scenarios under uncertainty. A BN consists of a directed acyclic graph (DAG) and a set of local distributions, where each node in the graph represents a random variable denoting an attribute, feature, or hypothesis about which we may be uncertain [22]. Each random variable has a set of mutually exclusive and collectively exhaustive possible values. The graph represents direct qualitative dependence relationships, while the local distributions represent quantitative information about the strength of those dependencies [22]. Together, the graph and local distributions represent a joint distribution over the random variables denoted by the nodes of the graph.
This application note explores how BNs enable researchers to model intricate relationships in complex activity scenarios, with particular emphasis on their application within pharmaceutical research and forensic science. The hierarchical modeling capability of BNs makes them particularly valuable for distinguishing between source-level and activity-level propositions—a crucial distinction in both drug development and forensic evidence evaluation. Activity-level propositions address questions of "How did this occur?" rather than simply "What is this?" allowing for more nuanced interpretative frameworks [23].
Within evidentiary reasoning, a critical distinction exists between source-level and activity-level propositions. Source-level propositions concern the origin of biological material (e.g., "Does this DNA come from suspect X?"), while activity-level propositions address the mechanisms that led to the transfer and presence of that material (e.g., "How did the suspect's DNA get on this item?") [23]. Bayesian Networks provide an ideal mathematical structure for navigating this hierarchy because the value of evidence calculated at one level cannot be automatically carried over to another level—each requires separate computational consideration [23].
The power of BNs lies in their ability to perform bi-directional belief updating through algorithms such as Pearl's propagation method [22]. When new evidence is introduced, information flows through the network via π messages (predictive support passed down from parents) and λ messages (diagnostic support passed up from children), updating node beliefs according to Bayesian principles. This updating mechanism allows activity-level scenarios to be refined as new experimental or observational data becomes available, making BNs particularly valuable for iterative research processes.
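The flavor of this updating can be shown on a toy two-node chain A → B with hypothetical numbers: λ(A) is obtained by marginalizing the evidence at B through P(B | A), and the belief at A fuses prior (π) and diagnostic (λ) support. This is a simplified single-message sketch, not a full implementation of Pearl's algorithm.

```python
# Toy illustration of pi/lambda belief updating on a chain A -> B.
# All probabilities are hypothetical.
pi_a = [0.8, 0.2]                       # pi(A): prior over A's two states
p_b_given_a = [[0.9, 0.1],              # row i: P(B | A = state i)
               [0.3, 0.7]]
lam_b = [0.2, 0.6]                      # likelihood evidence observed at B

# lambda message from B up to A: lambda(a) = sum_b P(b | a) * lambda_B(b)
lam_a = [sum(p_b_given_a[a][b] * lam_b[b] for b in range(2)) for a in range(2)]

# belief(A) fuses prior (pi) and diagnostic (lambda) support, then normalizes
belief = [pi_a[a] * lam_a[a] for a in range(2)]
total = sum(belief)
belief = [x / total for x in belief]
print(belief)  # evidence at B shifts belief toward A's second state
```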
Complex activity recognition must address not only primitive events but also their rich temporal dependencies and the inherent variability in how individuals perform the same activity [24]. A Bayesian network-based probabilistic generative framework can characterize these structural variabilities by incorporating Allen's temporal interval relations [24]. This approach can describe 13 interval-based relations between any pair of primitive events, managing multiple occurrences of the same primitive events and variable sizes of primitive events in complex activities through the Chinese Restaurant Process [24].
Table 1: Comparison of Bayesian Methods in Pharmaceutical Research
| Method | Primary Application | Key Advantages | Data Types Integrated |
|---|---|---|---|
| BANDIT [25] | Drug target identification | ~90% accuracy; integrates multiple data types | Drug efficacies, transcriptional responses, drug structures, adverse effects, bioassay results, known targets |
| WBCP [26] | Drug combination prediction | Superior AUC, accuracy, precision, and recall | ATC codes, SMILES structures, target sequences, GO terms, KEGG pathways, side effects |
| Generative Probabilistic Model [24] | Complex activity recognition | Handles temporal relational variabilities | Primitive events, temporal intervals, activity sequences |
The BANDIT (Bayesian ANalysis to determine Drug Interaction Targets) platform demonstrates the power of Bayesian approaches for drug target identification. This method integrates over 20,000,000 data points from six distinct data types: drug efficacies, post-treatment transcriptional responses, drug structures, reported adverse effects, bioassay results, and known targets [25]. For each data type, similarity scores are calculated for drug pairs, which are then converted into likelihood ratios and combined to produce a total likelihood ratio (TLR) that indicates the probability of two drugs sharing a target [25].
The BANDIT framework achieves approximately 90% accuracy in identifying shared target interactions across 2,000+ small molecules [25]. Its application to 14,000+ compounds without known targets generated approximately 4,000 previously unknown molecule-target predictions. Researchers validated 14 novel microtubule inhibitors from this set, including three with activity on resistant cancer cells [25]. Importantly, BANDIT successfully identified DRD2 as the target of ONC201—an anti-cancer compound in clinical development whose target had remained elusive—enabling more precise clinical trial design [25].
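The integration step can be sketched as a product of per-data-type likelihood ratios under a conditional-independence assumption. The LR values below are invented for illustration and are not BANDIT outputs; the real platform derives each LR from the empirical distributions of shared-target versus non-shared-target drug pairs.

```python
# Sketch of the Bayesian integration idea behind BANDIT: per-data-type
# likelihood ratios for a drug pair multiply (assuming conditional
# independence) into a total likelihood ratio (TLR). Values are hypothetical.
evidence_lrs = {
    "growth_inhibition": 4.2,
    "transcriptional_response": 2.8,
    "chemical_structure": 1.1,
    "adverse_effects": 3.5,
}

tlr = 1.0
for lr in evidence_lrs.values():
    tlr *= lr
print(f"TLR = {tlr:.1f}")  # a high TLR suggests the pair shares a target
```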
The WBCP method exemplifies advances in Bayesian approaches for predicting effective drug combinations. This weighted Bayesian integration method constructs a multiplex drug similarity network from seven types of drug similarity data: ATC similarity, SMILES structure similarity, target protein sequence similarity, GO semantic similarity, KEGG pathway similarity, SIDER side effects similarity, and OFFSIDES drug effects similarity [26]. The method formulates features for drug pairs by computing similarities between query drug pairs and all known drug combinations, then uses the maximum similarity value as a feature [26].
Unlike traditional Naive Bayes approaches that assume attribute independence, WBCP implements a Bayesian model with attribute weighting applied to the likelihood ratios of features [26]. This generates a support strength score (0-1), where higher scores indicate greater support for the drug pair belonging to the drug combination class. When comprehensively compared with other methods, WBCP demonstrates superior performance across multiple metrics, including Area Under the Receiver Operating Characteristic Curve, accuracy, precision, and recall [26].
Table 2: Drug Similarity Networks in WBCP Method
| Similarity Type | Data Source | Calculation Method | Biological Interpretation |
|---|---|---|---|
| ATC Similarity [26] | DrugBank database | Cosine similarity of IDF-weighted vectors | Therapeutic, pharmacological, and chemical characteristics |
| SMILES Similarity [26] | DrugBank database | Tanimoto coefficient of atom pairs | Chemical structural similarity |
| Target Sequence [26] | Uniprot database | Sequence descriptors and similarity | Similarity of drug target proteins |
| GO Semantic [26] | Uniprot database | Jaccard's coefficient of GO terms | Biological process similarity |
| KEGG Pathway [26] | KEGG database | Jaccard's coefficient of pathways | Pathway involvement similarity |
| Side Effects [26] | SIDER database | Jaccard's coefficient of side effects | Adverse reaction profile similarity |
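The attribute-weighting idea can be sketched as weights applied to log-likelihood-ratio contributions before combination. The specific weights, LR values, and the logistic-style squashing used here are illustrative assumptions, not the published WBCP formulation; they only show how weighting departs from the equal-treatment assumption of naive Bayes.

```python
# Hedged sketch of attribute weighting in a naive-Bayes-style score:
# each feature's likelihood ratio is raised to a weight (equivalently, its
# log-LR is scaled) before combination. All values are hypothetical.
import math

feature_lrs = [3.0, 1.5, 0.8, 2.2]        # per-similarity-type LRs (invented)
weights = [1.0, 0.6, 0.3, 0.9]            # attribute weights (invented)

log_score = sum(w * math.log(lr) for w, lr in zip(weights, feature_lrs))
weighted_lr = math.exp(log_score)

# squash into a 0-1 "support strength" score via the odds form
support = weighted_lr / (1 + weighted_lr)
print(f"support strength = {support:.3f}")
```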
Objective: Construct a Bayesian network for complex activity recognition using primitive events and their temporal dependencies.
Materials:
Procedure:
Establish Temporal Relations: Apply Allen's temporal interval algebra to characterize the possible relations between primitive events (precedes, follows, equals, during, etc.) [24].
Network Structure Generation: Implement the Chinese Restaurant Process (CRP) to generate tables containing unique sets of primitive events with their corresponding temporal relations [24]. Each table characterizes a particular style or cluster of the complex activity.
Parameter Learning: Estimate conditional probability distributions for nodes using available training data. Incorporate domain knowledge where data is sparse.
Model Validation: Evaluate recognition accuracy using k-fold cross-validation, measuring performance metrics such as AUC-ROC and precision-recall curves [25].
Troubleshooting Tips:
Objective: Identify potential drug targets for orphan compounds using the BANDIT Bayesian framework.
Materials:
Procedure:
Similarity Calculation: For each data type, compute similarity scores between all drug pairs using appropriate metrics for each data modality [25].
Likelihood Ratio Conversion: Convert individual similarity scores into distinct likelihood ratios using the distributions of shared-target versus non-shared-target drug pairs [25].
Bayesian Integration: Combine individual likelihood ratios to obtain a Total Likelihood Ratio (TLR) using Bayesian methods [25].
Target Prediction: Apply voting algorithm to identify specific binding targets by detecting recurring targets across high-TLR shared-target predictions [25].
Experimental Validation: Prioritize predicted targets for experimental confirmation using kinase inhibition assays or other relevant biological assays [25].
Validation Metrics:
Table 3: Essential Research Materials for Bayesian Network Applications
| Resource | Function | Application Context |
|---|---|---|
| DrugBank Database [26] | Source of ATC codes and SMILES structures | Drug similarity network construction |
| Uniprot Database [26] | Provides protein sequence data and GO terms | Target similarity calculations |
| KEGG Pathway Database [26] | Curated pathway information | Pathway similarity analysis |
| SIDER Database [26] | Drug side effect information | Adverse effect similarity profiling |
| NCI-60 Screening Data [25] | Drug efficacy patterns across cancer cell lines | Growth inhibition similarity scoring |
| Connectivity Map (CMap) [25] | Transcriptional response profiles | Gene expression similarity analysis |
| Pharmaceutical Company Databases [27] | Proprietary bioassay results and known targets | Industry drug development applications |
Bayesian Networks provide a mathematically rigorous framework for modeling complex activity scenarios across diverse research domains. Their ability to explicitly handle uncertainty, incorporate multiple data types, and adapt to evolving evidence makes them particularly valuable for addressing the challenges of activity-level proposition evaluation. The continued development of Bayesian methodologies, as demonstrated by BANDIT for drug target identification and WBCP for drug combination prediction, promises to enhance research efficiency and decision-making in both pharmaceutical development and forensic science. As these fields continue to generate increasingly complex and multidimensional data, Bayesian Networks offer a principled approach for integrating this information into coherent analytical frameworks that respect the hierarchical nature of evidentiary reasoning.
Chain Event Graphs (CEGs) are a class of probabilistic graphical models that offer a powerful framework for representing processes where events unfold asymmetrically over time. Unlike Bayesian Networks (BNs), which require symmetric variable states and can obscure temporal sequences, CEGs are derived from event trees and directly depict the possible pathways of a process, making them exceptionally suited for modeling real-world scenarios in reliability analysis, forensic science, and system diagnostics [28] [14]. Their topology explicitly represents context-specific dependencies and the partial temporal order of events, which are often intrinsic to causal hypotheses [28].
Within the context of a thesis comparing source-level and activity-level propositions, CEGs provide a formal mechanism to distinguish between these levels of analysis. Source-level propositions typically concern the origin of evidence (e.g., whether a particular component caused a system failure), while activity-level propositions involve inferring the sequence of actions or events that led to the observed evidence (e.g., the specific chain of failures that resulted in a system fault) [14]. The CEG's structure, built from root-to-leaf paths in an event tree, is inherently designed to model and evaluate these complex, asymmetric activity-level sequences, offering a transparent rationale for predictive inferences about system behavior under various intervention regimes [28].
The construction of a CEG begins with a finite event tree, which maps out all possible sequences of events in a process. The vertex set \(V_T\) contains all vertices, with the root vertex \(v_0\) representing the start of the process and leaf vertices \(L_T \subset V_T\) representing terminal outcomes. The non-leaf vertices are called situations, denoted \(S_T = V_T \setminus L_T\). Each directed edge \(e_{v,v'}\) in the edge set \(E_T\) represents a transition from situation \(v\) to a child situation \(v' \in ch(v)\) [28].
A probability tree is formed when each edge \(e_{v,v'}\) is assigned a transition probability \(\theta_{v,v'}\), such that for every situation \(v\), the vector \(\theta_v = (\theta_{v,v'})_{v' \in ch(v)}\) satisfies \(\sum_{v' \in ch(v)} \theta_{v,v'} = 1\) and \(\theta_{v,v'} \in (0,1)\). The probability of any root-to-leaf path is the product of the transition probabilities along its edges [28].
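These two definitions can be sketched computationally. The following minimal example (with an invented three-situation tree and illustrative probabilities) checks the local sum-to-one constraint and computes a root-to-leaf path probability as the product of its edge probabilities:

```python
# Minimal sketch of a probability tree: each situation maps to a dict of
# child -> transition probability. All structure and numbers are invented.
tree = {
    "v0": {"v1": 0.7, "v2": 0.3},   # root: two possible initial developments
    "v1": {"ok": 0.9, "fail": 0.1},
    "v2": {"ok": 0.4, "fail": 0.6},
}

def check_local_sums(tree, tol=1e-9):
    """Each situation's outgoing probabilities must sum to 1."""
    return all(abs(sum(children.values()) - 1.0) < tol
               for children in tree.values())

def path_probability(tree, path):
    """Probability of a root-to-leaf path = product of edge probabilities."""
    p = 1.0
    for parent, child in zip(path, path[1:]):
        p *= tree[parent][child]
    return p

assert check_local_sums(tree)
print(path_probability(tree, ["v0", "v1", "fail"]))  # 0.7 * 0.1
```

The same pattern extends to trees of arbitrary depth, since the path probability is simply accumulated edge by edge.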
To create a CEG, the probability tree is first transformed into a staged tree by coloring its situations. Two situations \(v_i\) and \(v_j\) are assigned the same color (i.e., are in the same stage) if their associated probability vectors \(\theta_{v_i}\) and \(\theta_{v_j}\) are identical, meaning they share the same conditional probability distribution over subsequent events. This coloring embeds conditional independence statements into the tree model [28] [14]. The final CEG is constructed from the staged tree by merging situations that have isomorphic subtrees and identical coloring patterns. All leaf nodes are coalesced into a single sink node, simplifying the graph while preserving all possible unfoldings of events represented by the root-to-leaf paths [14].
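The staging step above amounts to partitioning situations by identical probability vectors. A minimal sketch (with hypothetical situations and outcome vectors) of that grouping:

```python
from collections import defaultdict

# Illustrative staging step: situations whose outgoing probability vectors
# (over the same ordered outcomes) coincide belong to the same stage.
# Situations and probabilities below are hypothetical.
theta = {
    "v1": (("ok", 0.9), ("fail", 0.1)),
    "v2": (("ok", 0.4), ("fail", 0.6)),
    "v3": (("ok", 0.9), ("fail", 0.1)),   # identical to v1 -> same stage
}

def stages(theta):
    """Partition situations into stages by identical probability vectors."""
    groups = defaultdict(list)
    for situation, vector in theta.items():
        groups[vector].append(situation)
    return [sorted(members) for members in groups.values()]

print(stages(theta))  # v1 and v3 share a stage; v2 is alone
```

In practice the stage structure is learned from data or elicited from experts (see Table 3's staging and model selection algorithms); this sketch only illustrates the defining equivalence.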
CEGs address several limitations of traditional graphical tools like Fault Trees (FTs) and Bayesian Networks (BNs). The table below summarizes the key comparative advantages of CEGs.
Table 1: Comparison of Graphical Models for Asymmetric and Time-Ordered Processes
| Feature | Fault Trees (FTs) | Bayesian Networks (BNs) | Chain Event Graphs (CEGs) |
|---|---|---|---|
| Representation of Asymmetry | Limited, structured top-down logic [28] | Poor, requires symmetric variable states [14] | Excellent, directly represents asymmetric paths in its topology [28] |
| Explicit Temporal Order | No explicit partial order [28] | Not inherently displayed [14] | Yes, paths explicitly show event sequences [14] |
| Context-Specific Independence | Not represented | Limited representation | Explicitly represented through staging [28] |
| Causal Intervention Modeling | Not standard | Limited to atomic interventions (e.g., do-calculus) [28] | Flexible, supports novel interventions like remedial maintenance [28] |
| Handling of Probability Propagation | Not seamless [28] | Excellent for symmetric problems [28] | Excellent, manages uncertainty in complex paths [28] [14] |
For activity-level proposition research, the most significant advantage is the CEG's ability to naturally model asymmetric developments. In a BN, constructing a variable to represent a sequence of activities can be awkward and may obscure the natural timeline. In a CEG, each possible storyline proposed by different parties (e.g., prosecution vs. defense in a forensic case) is directly represented by a subset of root-to-leaf paths, making the model intuitive for explaining the unfolding of events step-by-step [14].
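The storyline idea can be made concrete: each party's narrative corresponds to a subset of root-to-leaf paths, and the probability of a narrative-plus-evidence combination is the sum of its path probabilities. The tree structure, labels, and numbers below are wholly hypothetical:

```python
# Sketch: partition root-to-leaf paths by the storyline they support and
# sum their probabilities. All structure and values are invented.
tree = {
    "start":   {"traffic": 0.2, "use": 0.8},    # competing narratives
    "traffic": {"traces": 0.95, "clean": 0.05},
    "use":     {"traces": 0.40, "clean": 0.60},
}

def path_prob(path):
    p = 1.0
    for a, b in zip(path, path[1:]):
        p *= tree[a][b]
    return p

# Paths consistent with the observed evidence, grouped by narrative.
prosecution_paths = [["start", "traffic", "traces"]]
defence_paths     = [["start", "use", "traces"]]

p_h1 = sum(path_prob(p) for p in prosecution_paths)  # joint prob., narrative 1
p_h2 = sum(path_prob(p) for p in defence_paths)      # joint prob., narrative 2
print(p_h1, p_h2)
```

Note these are joint probabilities of narrative and evidence; conditioning on each proposition (as in the likelihood ratio protocols later in this document) removes the narrative priors from the comparison.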
This protocol details the application of CEGs for causal reliability analysis, particularly focusing on modeling the effects of remedial maintenance.
Objective: To create a comprehensive event tree representing all possible sequences of component states and failures in a system.
Materials and Reagents:
Methodology:
Objective: To transform the elaborated event tree into a Chain Event Graph by identifying situations with identical future prognoses.
Methodology:
Graphviz DOT Script for a Generic CEG in Reliability Analysis
Objective: To use the CEG to model the causal effect of a remedial intervention, which fixes a root cause and returns the system to an "as good as new" state [28].
Materials and Reagents:
Methodology:
Table 2: Quantitative Outcomes of Remedial Interventions on a Simulated System
| Intervention Type | Target Component/Path | Pre-Intervention Failure Probability | Post-Intervention Failure Probability | Relative Risk Reduction |
|---|---|---|---|---|
| Atomic (BN-style) | Component A | 0.065 | 0.045 | 30.8% |
| Remedial (CEG) | Root Cause 1 | 0.065 | 0.015 | 76.9% |
| Remedial (CEG) | Root Cause 2 | 0.065 | 0.025 | 61.5% |
| Combined Remedial | Root Causes 1 & 2 | 0.065 | 0.005 | 92.3% |
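The relative risk reductions in Table 2 follow directly from the pre- and post-intervention failure probabilities via RRR = (pre − post) / pre. A quick check of the tabulated values:

```python
# Verify the relative-risk-reduction column of Table 2:
# RRR = (pre - post) / pre, as a percentage.
rows = [
    ("Atomic (BN-style), Component A",   0.065, 0.045),
    ("Remedial (CEG), Root Cause 1",     0.065, 0.015),
    ("Remedial (CEG), Root Cause 2",     0.065, 0.025),
    ("Combined Remedial, Causes 1 & 2",  0.065, 0.005),
]

for label, pre, post in rows:
    rrr = 100.0 * (pre - post) / pre
    print(f"{label}: {rrr:.1f}%")  # 30.8%, 76.9%, 61.5%, 92.3%
```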
This protocol adapts CEGs for evaluating competing activity-level propositions based on evidence, using a drug trafficking case with contaminated banknotes as an example [14].
Objective: To formally define the competing activity-level propositions from prosecution and defense and map them onto the structure of an event tree.
Materials and Reagents:
Methodology:
Objective: To build a CEG that is simplified and tailored for forensic presentation and likelihood ratio calculation.
Methodology:
Graphviz DOT Script for a Forensic CEG with Dual Sinks
Objective: To compute the likelihood ratio (LR) that quantifies the support given by the evidence to the prosecution's proposition relative to the defense's proposition.
Materials and Reagents:
Methodology:
Table 3: Key Research Reagents and Computational Tools for CEG Analysis
| Tool / Reagent | Type | Function in CEG Research | Example / Note |
|---|---|---|---|
| cegpy | Software Library | A Python package for constructing and performing inference on Chain Event Graphs [29]. | Enables computational implementation of the protocols described herein. |
| Historical System Data | Data | Provides the empirical basis for estimating prior transition probabilities \(\theta_{v,v'}\) in the initial event tree. | System maintenance logs, failure reports. |
| Expert Elicitation Framework | Methodology | A structured process for gathering qualitative knowledge about process structure and quantitative estimates of probabilities from domain experts. | Critical for building models in data-sparse environments. |
| Staging & Model Selection Algorithms | Computational Algorithm | Identifies optimal stage structures from data, balancing model complexity with goodness-of-fit. | Uses likelihood-based or Bayesian information criteria. |
| Causal Algebra Formalism | Mathematical Framework | The set of rules for manipulating edge probabilities on the CEG to represent various types of interventions [28]. | Essential for causal reliability analysis and remedial intervention modeling. |
| Likelihood Ratio Calculator | Software Module | A tool to compute the LR by summing probabilities of evidence-conditioned paths under competing propositions in a forensic CEG. | Can be implemented as part of a larger CEG software suite. |
In forensic science, the evolution from source-level to activity-level propositions represents a significant shift in how evidence is evaluated for the court. Source-level questions ask, "Is this DNA or drug trace from this specific person or item?" In contrast, activity-level questions address, "How did this individual's cell material or drug residue get onto this item, and what activities does this imply?" [23] [1]. This application note demonstrates how Chain Event Graphs (CEGs), a robust probabilistic graphical model, provide a formal framework for evaluating activity-level propositions in a case involving drug traces on banknotes, moving beyond traditional Bayesian Networks (BNs) to handle the asymmetric and temporal nature of activities [14].
For evaluating activity-level propositions, CEGs offer distinct advantages over the more traditionally used Bayesian Networks (BNs) [14].
Table 1: Comparison of Bayesian Networks and Chain Event Graphs for Activity-Level Evaluation
| Feature | Bayesian Networks (BNs) | Chain Event Graphs (CEGs) |
|---|---|---|
| Underlying Structure | Based on random variables and their conditional dependencies. | Constructed from an underlying probability tree of possible sequential events. |
| Handling Asymmetry | Limited capability; all variables must be defined across the same states. | Excellent; naturally accommodates asymmetric developments and dead ends in event sequences. |
| Temporal Display | Does not inherently display the temporal order of events. | Clearly displays the chronological unfolding of events and decisions. |
| Direct Storyline Representation | Storylines are inferred from the state of the network. | Root-to-leaf paths directly represent and display competing narratives (e.g., prosecution vs. defence). |
| Context-Specific Independence | Captures conditional independence. | Captures both conditional and context-specific independence. |
The CEG is constructed by first drawing a probability tree of all possible scenarios. The tree's vertices (situations) and edges are then coloured to identify situations that share the same probability distributions for subsequent events, creating a staged tree. Finally, the CEG is formed by amalgamating situations where the coloured subtrees are isomorphic, creating a more compact graph that preserves all logical paths and dependencies [14].
This application is based on a real-world drug trafficking case (Compton and Ors v R. [2002]). The suspect, Stephen Compton, was a known drug user. Police seized £107,000 in used banknotes from two safes at his address [14]. The core question was whether the drug traces on the notes were evidence of drug trafficking (prosecution's proposition) or merely a consequence of the suspect's personal drug use and the normal circulation of banknotes (defence's proposition) [14].
The evaluation requires formulating two mutually exclusive activity-level propositions [23] [1].
The CEG is built to model the possible pathways that could lead to the observed evidence (drug traces on a large sum of money). The following DOT script visualizes the simplified CEG for this case.
Figure 1: A simplified Chain Event Graph (CEG) for the drug-traced banknotes case. The graph models the distinct paths supporting the prosecution (red) and defence (green) propositions, demonstrating the inherent asymmetry of the activity-level narratives.
The Likelihood Ratio (LR) quantifies the support the evidence provides for one proposition over the other. It is calculated as the probability of the evidence under the prosecution proposition divided by its probability under the defence proposition [14] [1].
LR = P(E | H1) / P(E | H2)

Where:

- E is the observed evidence (drug traces on the seized banknotes);
- H1 is the prosecution's proposition (the money is connected to drug trafficking);
- H2 is the defence's proposition (the traces result from personal use and normal banknote circulation).
An LR greater than 1 supports the prosecution's case, while an LR less than 1 supports the defence's case. The CEG framework allows for the incorporation of various data sources to estimate these probabilities, such as:
Table 2: Example Quantitative Inputs for LR Calculation in CEG
| Parameter | Description | Example Value (for Illustration) | Data Source |
|---|---|---|---|
| P(E \| Trafficking) | Probability of finding drug traces on notes used in drug trade. | 0.95 | Expert judgement, case data from known trafficking seizures [14]. |
| P(E \| Personal Use) | Probability of finding drug traces on a large cash savings of a user. | 0.40 | Empirical studies on note contamination in user households. |
| P(High Cash \| User) | Probability a user holds large cash savings. | 0.10 | Demographic & financial data. |
| P(Drug User) | Base rate of drug use in the relevant population. | 0.05 | National statistics. |
| P(Environmental Cont.) | Probability of significant contamination from circulation. | 0.70 | Empirical studies on background contamination levels [14]. |
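With the illustrative conditional probabilities from Table 2, the likelihood ratio at the top level of the model reduces to a simple ratio (the values are for illustration only, not case data):

```python
# Illustrative LR from the Table 2 inputs:
# LR = P(E | trafficking) / P(E | personal use).
p_e_given_trafficking = 0.95
p_e_given_personal_use = 0.40

lr = p_e_given_trafficking / p_e_given_personal_use
print(round(lr, 3))  # the evidence is about 2.4 times more probable under H1
```

In a full CEG evaluation these two conditional probabilities would themselves be obtained by summing evidence-consistent path probabilities under each proposition, rather than being assigned directly.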
This protocol outlines the steps for applying a CEG to evaluate activity-level propositions in a similar case.
Table 3: Key Reagents and Materials for Drug Trace Analysis on Banknotes
| Item | Function in Analysis |
|---|---|
| Gas Chromatography-Mass Spectrometry (GC-MS) | The confirmatory standard for drug identification. Separates chemical components (GC) and provides a unique molecular fingerprint for identification (MS) [30]. |
| Fourier Transform Infrared (FTIR) Spectroscopy | A confirmatory technique that identifies organic functional groups and specific drug compounds based on their infrared absorption spectrum [30]. |
| Stereo Binocular Microscope | Used for the preliminary visual examination of banknotes to identify and sample suspicious particles or powder residues [30]. |
| Chemical Spot Test Kits | Preliminary colorimetric tests (e.g., Marquis test for opioids/amphetamines) provide an initial indication of a drug's chemical class before confirmatory analysis [30]. |
| Polarized Light Microscope (PLM) | Used for microcrystalline tests, where a reagent is added to a sample to form crystals unique to a specific drug, providing a confirmatory identification [30]. |
| Solvents (e.g., Methanol, Ethanol) | High-purity solvents are used to extract drug residues from the surface of banknotes or other substrates for subsequent instrumental analysis [30]. |
Within forensic science, the distinction between source level and activity level propositions is fundamental to accurately evaluating the significance of biological evidence. Source level propositions address the question "Whose DNA is this?", while activity level propositions address the more complex question "How and when did this DNA get here?" [23]. Pre-assessment and contextual sampling are critical, interdependent processes that enable forensic scientists to formulate robust, case-specific propositions and design testing strategies that yield forensically relevant and logically sound interpretations [1]. This document outlines detailed application notes and protocols for implementing these processes within the framework of source level versus activity level research.
The value of forensic evidence is critically dependent on the propositions put forward for evaluation. The hierarchy of propositions provides a structured approach to moving from general source identification to specific activity assessments [1] [23].
Table 1: Levels in the Hierarchy of Propositions
| Level | Core Question | Example Proposition Pair | Considerations |
|---|---|---|---|
| Source Level | Whose DNA is this? | The DNA originated from Mr. Smith vs. The DNA originated from an unknown, unrelated person. | Focuses on analytical data and comparison of DNA profiles. Often uses Likelihood Ratios (LR) for evaluation [1]. |
| Activity Level | How did the DNA get there? | Mr. Smith assaulted the victim vs. Mr. Smith had consensual contact with the victim the day before. | Requires consideration of transfer and persistence mechanisms, timing, and alternative activities [23]. |
It is crucial to understand that the value of evidence calculated for a DNA profile at the source level cannot be directly carried over to activity level assessments [23]. Activity level evaluation requires separate, specific calculations that incorporate data on transfer probabilities and other activity-related factors.
Pre-assessment is a planning phase conducted before laboratory analysis. Its primary purpose is to define the case-specific issues, formulate relevant propositions, and design an examination strategy that is logical, balanced, and transparent [1]. This is especially vital when questions relate to alleged activities, as the scientist must consider how to guide the court on issues of transfer and persistence [1].
Objective: To define the scope of forensic analysis based on the framework of case circumstances and the questions posed by the mandating authority.

Materials: Case information package, pre-assessment form, relevant standard operating procedures (SOPs).

Procedure:
Diagram 1: Pre-assessment workflow for forensic casework.
Contextual sampling involves the strategic collection of control and background samples to help interpret the primary forensic findings. This practice is essential for distinguishing between alternative activity level propositions and for minimizing the risk of contextual bias by ensuring all plausible explanations are investigated [31].
Objective: To collect samples that will allow for the evaluation of DNA results given activity level propositions, including the assessment of background DNA and potential secondary transfer.

Materials: Sterile swabs, distilled water, cutting instruments, sample containers, personal protective equipment.
Procedure:
Direct PCR bypasses DNA extraction and quantification, amplifying a portion of the sample directly. This method offers key advantages for certain casework scenarios, including reduced turnaround time, decreased contamination risk, and, crucially, no sample loss from extraction—preserving material for further testing [32].
Table 2: Key Research Reagent Solutions for Direct PCR
| Reagent/Material | Function/Application | Key Considerations |
|---|---|---|
| AmpFlSTR Identifiler Plus | Commercial autosomal STR amplification kit for direct PCR. | Contains 15 autosomal STR loci and Amelogenin. More robust to inhibitors than earlier versions [32]. |
| PowerPlex 18D / 21 / Fusion | Commercial autosomal STR amplification kits designed for or validated for direct amplification. | Performance varies; some studies show equal performance across kits for blood and saliva [32]. |
| Micro Punches (1x1 mm) | For sampling small areas of swabs or stains on porous substrates. | Optimizes input material, helps prevent inhibitor overload and DNA template overload [32]. |
| Distilled Water | Sample pre-treatment for blood swabs. | Reduces inhibitor content (e.g., haemoglobin) without the need for extraction kits [32]. |
| Half-Volume Reactions | A PCR reaction run at half the standard volume. | Found to be suitable for direct amplification, conserving reagents while producing reliable profiles [32]. |
Experimental Protocol (Direct PCR for Blood Swabs) [32]:
Diagram 2: Analytical pathway for direct PCR.
The evaluation of findings follows a Bayesian framework, calculating a Likelihood Ratio (LR) to express the strength of the evidence. The formula is:

LR = Pr(E | Hp, I) / Pr(E | Hd, I)

where Pr denotes probability, E is the evidence, Hp is the prosecution proposition, Hd is the defense proposition, and I is the case background information [1] [23].
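Although not part of the protocol itself, the standard odds form of Bayes' theorem shows how a reported LR is meant to be used: posterior odds = LR × prior odds, where the prior odds are for the trier of fact, not the scientist, to assign. A minimal sketch with invented numbers:

```python
# Odds form of Bayes' theorem: posterior odds = LR * prior odds.
# The scientist reports the LR; the court supplies the prior odds.
# All numbers below are purely illustrative.
def posterior_odds(likelihood_ratio, prior_odds):
    return likelihood_ratio * prior_odds

def odds_to_probability(odds):
    return odds / (1.0 + odds)

lr = 4.0       # evidence 4 times more probable under Hp than under Hd
prior = 0.25   # prior odds of 1:4 in favour of Hp
post = posterior_odds(lr, prior)
print(post, odds_to_probability(post))  # odds of 1.0 -> probability 0.5
```

This separation of roles is why evaluative reports state the LR alone: the same LR of 4 moves weak prior odds to even odds here, but would move different priors to different posteriors.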
Barriers to Adoption: Despite its logical rigor, the global adoption of evaluative reporting for activity level propositions faces barriers. These include a lack of robust data to inform probabilities, regional differences in regulations, and a need for specialized training [10]. This underscores the importance of research to build relevant knowledge bases on DNA transfer, persistence, and prevalence.
Pre-assessment and contextual sampling are not merely administrative steps but are the foundation of a robust, logical, and transparent forensic science process. By rigorously applying these protocols, forensic scientists can effectively navigate the critical distinction between source level and activity level propositions. This ensures that the evaluation of biological evidence is conducted in a manner that truly assists the trier of fact in understanding the issues in the case, thereby strengthening the administration of justice.
In forensic science, a paradigm shift is occurring, moving from traditional source-level propositions (which address the origin of a biological sample) to activity-level propositions (which address how that material was transferred during an alleged event) [3] [2]. This transition is critical because it addresses the fundamental questions in legal proceedings: not just "Whose DNA is this?" but "How did it get there?" [2]. However, the implementation of activity-level evaluations faces significant barriers related to data fragmentation, specialized training requirements, and deep-seated methodological conservatism. This application note details these barriers and provides structured protocols to facilitate the adoption of robust, empirically-supported activity-level evaluations in forensic casework, framed within broader research on source level versus activity level propositions.
The evaluation of evidence given activity-level propositions requires extensive data on transfer, persistence, prevalence, and recovery (TPPR) phenomena, which are often unavailable or fragmented [2] [33].
Table 1: Primary Data-Related Barriers in Activity-Level Evaluation
| Barrier Category | Specific Challenge | Impact on Evaluation |
|---|---|---|
| Data Fragmentation | Reluctance among stakeholders to share data due to competing priorities, privacy concerns, and institutional silos [34] | Prevents creation of "deep data" resources necessary for robust TPPR parameter estimation |
| Relevance & Specificity | Difficulty applying controlled experimental data to unique case circumstances with unknown variables [2] | Creates reluctance to use available numerical values from laboratory studies in real-case evaluations |
| Knowledge Base Gaps | Lack of relevant, traceable data on variables influencing transfer and persistence for specific scenarios [23] [33] | Undermines probabilistic assessments and prevents standardization across casework |
A significant obstacle is the lack of personnel adequately trained in the specialized methodologies required for activity-level evaluation [3] [2].
A deeply ingrained "cultural gravitational pull" back to traditional established methods represents a profound barrier [34] [2].
Table 2: Cultural and Perceptual Barriers to Implementation
| Resistance Factor | Manifestation | Consequence |
|---|---|---|
| Methodological Mistrust | Discomfort with non-interventional research methods and preference for traditional RCT-like approaches [34] | Reluctance to adopt Bayesian networks and likelihood ratio frameworks for activity-level propositions |
| Procedural Inertia | Organizational systems and processes optimized for traditional evidence generation [34] | Lack of infrastructure for efficient implementation of alternative evaluation designs |
| Perceived Speculation | View that activity-level evaluations are overly speculative due to multiple unknown variables [2] | Avoidance of activity-level reporting despite its potential value to judicial decision-making |
Understanding the relative impact of different variables on activity-level evaluations is essential for prioritizing research and resource allocation.
Table 3: Experimental Data Requirements for Activity-Level Evaluation
| Variable Category | Data Type | Collection Method | Implementation Use |
|---|---|---|---|
| Transfer Probabilities | Quantitative DNA recovery amounts under different contact scenarios [35] | Controlled simulation experiments mimicking alleged activities | Informs likelihood ratio calculations for transfer events |
| Persistence Metrics | Time-dependent degradation rates of biological material on various surfaces [2] | Longitudinal studies measuring DNA recovery over time | Supports temporal assessments of alleged activities |
| Background Prevalence | DNA profile occurrence in relevant environments and populations [2] | Systematic sampling of public spaces, clothing, and surfaces | Provides context for evaluating the significance of findings |
| Recovery Efficiencies | Extraction and analysis yields across different sample types and collection methods [33] | Comparison studies using standardized sampling protocols | Adjusts for methodological limitations in evidence processing |
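As a purely hypothetical illustration of the persistence metrics row in Table 3 (real TPPR data are scenario-specific and need not follow any simple functional form), a time-dependent recovery curve might be parameterised by a half-life:

```python
# Hypothetical persistence model for illustration only: if recovery
# declined roughly exponentially, a half-life parameterisation would give
# remaining fraction = 0.5 ** (t / half_life). The functional form and
# the 24-hour half-life are assumptions, not empirical values.
def remaining_fraction(t_hours, half_life_hours):
    return 0.5 ** (t_hours / half_life_hours)

for t in (0, 24, 48, 96):
    print(t, remaining_fraction(t, half_life_hours=24))
```

Longitudinal studies of the kind described in the table would be needed to choose an appropriate model and estimate its parameters for a given substrate and activity.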
Purpose: To generate quantitative data for evaluating competing activity-level propositions regarding direct contact versus indirect transfer [35].
Materials:
Procedure:
Purpose: To create a transparent, probabilistic model for evaluating forensic findings given activity-level propositions using an idiom-based approach [33].
Materials:
Procedure:
Table 4: Key Research Reagent Solutions for Activity-Level Studies
| Reagent/Material | Function | Application Example |
|---|---|---|
| DNA-free substrates | Provides controlled surfaces for transfer studies | Testing transfer efficiency across fabric, metal, plastic surfaces |
| Quantitative PCR kits | Precisely measures DNA quantities recovered | Establishing transfer probability distributions for different activities |
| Standardized sampling kits | Ensures consistent collection of biological material | Validating recovery rates across different operators and conditions |
| Probabilistic genotyping software | Interprets complex DNA mixture data | Supporting source-level evaluations as foundation for activity assessment |
| Bayesian network platforms | Implements probabilistic reasoning frameworks | Constructing case-specific models for evaluating activity propositions |
| Synthetic DNA controls | Provides reference material for validation studies | Establishing baseline performance metrics without donor variability |
In forensic science research, the distinction between source level and activity level propositions is fundamental. Source level propositions seek to identify the origin of a piece of evidence (e.g., "Does this DNA come from this person?"), while activity level propositions interpret the actions that led to the evidence's deposition (e.g., "How did this DNA get onto this object?") [36]. This application note posits that robustly addressing data gaps in both contexts necessitates a case study research approach. Such an approach provides the case-specific contextual samples required to move beyond mere identification and toward meaningful interpretation of complex, real-world scenarios [37] [36].
A case study is a detailed, holistic, and contextualized account of a real-world phenomenon, bounded by time and space [37] [36]. Its unique strength in propositions research lies in its ability to integrate multiple data sources—both quantitative and qualitative—to construct a thick description of the case [37]. This multimethod nature is critical for closing data gaps that emerge from the inherent complexity of forensic activities, where laboratory data alone may be insufficient to reconstruct events.
The following protocols are designed to guide researchers in implementing a case-specific approach to address data gaps in propositions research.
Aim: To investigate the variability and persistence of a specific trace evidence (e.g., DNA, fibres, GSR) under different activity scenarios.
Step 1: Case Definition and Bounding
Step 2: Theoretical Framework Development
Step 3: Multimodal Data Collection
Step 4: Within-Case and Cross-Case Analysis
Step 5: Theory Modification and Protocol Refinement
Aim: To achieve a deep, contextual understanding of a single, unique criminal incident where standard sampling approaches have led to ambiguous or conflicting results.
Step 1: Case Selection
Step 2: Historical Data Reconstruction
Step 3: Contextual Data Integration
Step 4: Narrative Construction and Triangulation
Step 5: Generating Transferable Insights
The quantitative data derived from contextual samples require rigorous management to ensure validity.
| Stage | Procedure | Action |
|---|---|---|
| Data Cleaning | Check for duplicates. | Remove identical participant/data records. |
| | Assess missing data. | Use Little's MCAR test to determine the pattern of missingness. Set and apply a completion threshold (e.g., >50%). |
| | Identify anomalies. | Run descriptive statistics to find values outside expected ranges (e.g., a Likert score of 6 on a 1-5 scale). |
| Data Analysis | Descriptive Statistics | Calculate frequencies, means, standard deviations for all variables. |
| | Assess Normality | Use Kolmogorov-Smirnov/Shapiro-Wilk tests and evaluate Skewness/Kurtosis (±2) [40]. |
| | Inferential Statistics | Based on normality, use parametric (e.g., t-tests, ANOVA) or non-parametric tests (e.g., Mann-Whitney U, Chi-square). |
Handling Missing Quantitative Data: The following techniques are essential for addressing data gaps within datasets themselves [40] [41].
Table 2: Data Imputation Techniques for Missing Data [41]
| Technique | Description | Best Use Case |
|---|---|---|
| Mean Substitution | Replaces missing values with the mean of observed data for that variable. | Simple, quick fix when data is Missing Completely at Random (MCAR) and the amount of missingness is very low. |
| Regression Imputation | Predicts missing values using relationships with other variables in the dataset. | When a strong correlation exists between the variable with missing data and other complete variables. |
| Multiple Imputation | Creates several complete datasets by simulating missing values based on statistical models; results are pooled for final analysis. | Gold standard for handling data that is not MCAR; accounts for uncertainty associated with the imputed values [41]. |
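A minimal sketch of mean substitution, the simplest technique in Table 2, in plain Python (missing entries encoded as None; suitable only when missingness is MCAR and very low):

```python
from statistics import mean

# Mean substitution: replace each missing value with the mean of the
# observed values for that variable. A quick fix, not a general remedy.
def impute_mean(values):
    observed = [v for v in values if v is not None]
    m = mean(observed)
    return [m if v is None else v for v in values]

scores = [3, 4, None, 5, None, 4]
print(impute_mean(scores))  # missing entries become the observed mean (4)
```

Regression and multiple imputation follow the same interface idea (fill the gaps, keep the observed values) but model the missing entries instead of using a single constant, which is why multiple imputation is the preferred choice when data are not MCAR.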
| Item | Function in Contextual Sampling |
|---|---|
| Standardized Evidence Collection Kits (e.g., swabs, lifters, particle vacuums) | Ensures consistent, comparable, and court-defensible sampling of trace materials across different cases or scenarios. |
| Digital Video Recording System | Provides objective, qualitative data on activities and interactions, allowing for precise correlation with physical sample locations. |
| Statistical Software (e.g., IBM SPSS, R, Stata) | Facilitates data cleaning, imputation, descriptive statistics, and advanced inferential analysis to quantify patterns and test hypotheses [41]. |
| Qualitative Data Analysis Software (e.g., NVivo, Dovetail) | Aids in the systematic coding and analysis of interview transcripts, field notes, and documents, enabling integration with quantitative findings [38]. |
The following diagram illustrates the integrated, iterative workflow for conducting case study research to address data gaps in propositions research.
Forensic science is undergoing a fundamental paradigm shift from addressing source-level questions to tackling more complex activity-level propositions. While source-level propositions concern the origin of biological material (e.g., "Does this DNA come from Mr. A?"), activity-level propositions address how that material was transferred through specific actions (e.g., "Did Mr. A punch the victim?") [2]. This transition is driven by the recognition that with modern DNA profiling technology capable of producing results from minute quantities of material, the issue of source is becoming less frequently contested in judicial proceedings [2]. The critical question has evolved from "Whose DNA is this?" to "How did it get there?" [2]. This shift necessitates more sophisticated strategies for probability assignment that extend beyond generic population statistics to incorporate case-specific circumstances, transfer mechanisms, and persistence factors. The evaluation of biological traces considering activity level propositions represents an essential advancement for forensic science to provide more focused and useful contributions to the criminal justice process [23] [2].
The hierarchy of propositions represents a fundamental concept for the evaluation of biological results, creating critical distinctions between source-level and activity-level assessments [7]. At the source level, propositions focus solely on the origin of the biological material, typically requiring primarily an assessment of the rarity of the corresponding analytical features in the relevant population [2]. In contrast, activity-level propositions demand a more comprehensive probabilistic framework that incorporates additional factors including transfer mechanisms, persistence characteristics, background presence of DNA, and the specific contextual details of the case circumstances [2]. It is crucial to recognize that the value of evidence calculated for a DNA profile at the source level cannot be directly carried over to higher levels in the hierarchy—the calculations given sub-source, source, and activity level propositions are all separate evaluations [23].
The evaluation of scientific results with activity level propositions employs a Bayesian framework to derive a likelihood ratio (LR). The scientist assigns the probability of the evidence under each of the alternate propositions to compute [23]:
LR = Pr(E|H₁) / Pr(E|H₂)
Where E represents the forensic findings, H₁ represents the prosecution proposition, and H₂ represents the defense proposition. For activity-level assessments, this requires the scientist to address two fundamental questions: (a) "What are the expectations if each of the propositions is true?" and (b) "What data are available to assist in the evaluation of the results given the propositions?" [23]. This framework provides a transparent methodology for experts to evaluate a case, creating a forum where differences of opinion may be discussed and resolved within the judicial process [2].
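As a minimal illustration, the LR computation can be sketched in Python; the probability assignments below are invented placeholders, not values from any case:

```python
def likelihood_ratio(p_e_given_h1, p_e_given_h2):
    """LR: probability of the findings E under the prosecution
    proposition H1, divided by that under the defense proposition H2."""
    return p_e_given_h1 / p_e_given_h2

# Hypothetical assignments: the findings would be expected if H1 were
# true, but would arise only rarely (e.g., via background DNA) if H2
# were true.
lr = likelihood_ratio(0.95, 0.01)  # roughly 95: support for H1 over H2
```

An LR above 1 supports H₁, below 1 supports H₂, and equal probabilities under both propositions yield an LR of 1 (no assistance to the court).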
Table 1: Key Differences Between Source and Activity Level Propositions
| Assessment Factor | Source Level Propositions | Activity Level Propositions |
|---|---|---|
| Focus Question | "Whose DNA is this?" | "How did the DNA get there?" |
| Primary Input | Profile rarity in population | Transfer mechanisms, persistence, background |
| Data Requirements | Population databases | Case-specific experimental data |
| Complexity | Relatively straightforward | Multifactorial, complex |
| Typical Output | Random match probability | Likelihood ratio incorporating activities |
The pre-assessment phase is critically important when questions relate to alleged activities, as it allows scientists to determine whether they possess the necessary data and expertise to provide meaningful evaluation before committing to full analysis [7].
Procedure:
To assign probabilities for activity-level evaluations, analysts should collect data that are relevant to the case in question [23]. This protocol establishes methodology for creating case-relevant knowledge bases.
Procedure:
Bayesian Networks are extremely useful to help think about complex problems because they force consideration of all relevant possibilities in a logical way [23]. They provide a structured methodology for incorporating multiple probabilistic factors in activity-level assessments.
Procedure:
Diagram 1: Activity Level Assessment Network
Table 2: Key Research Reagents and Materials for Transfer and Persistence Studies
| Item | Function | Application Notes |
|---|---|---|
| Synthetic Skin Substrates | Simulates human skin for transfer studies | Varies in porosity and surface texture; select based on case circumstances |
| DNA Standards | Quantification and profiling controls | Enables standardization across experiments |
| Surface Sampling Kits | Recovery of DNA from various surfaces | Efficiency varies by surface type; must validate for each material |
| Environmental Chambers | Controls temperature, humidity, light | Simulates realistic environmental conditions for persistence studies |
| Shedder Status Assay | Classifies DNA shedding propensity | Critical individual factor affecting transfer probabilities |
| Statistical Software | Bayesian analysis and modeling | Enables computation of likelihood ratios and probabilistic assessment |
The following workflow provides a structured approach for moving beyond generic probabilities to case-relevant assignment:
Diagram 2: Probability Assignment Workflow
Table 3: Probability Factors in Activity-Level Assessment
| Factor | Measurement Approach | Data Input Requirements |
|---|---|---|
| Transfer Probability | Controlled transfer studies under varying conditions | Contact type, pressure, duration, surface materials, shedder status |
| Persistence Rate | Time-series sampling after controlled deposition | Environmental conditions, surface properties, clothing materials |
| Background Prevalence | Systematic sampling of relevant populations and environments | Demographic factors, environment type, occupational exposure |
| Recovery Efficiency | Comparison of known deposits with recovery yields | Sampling method, substrate, analyst expertise |
| Analysis Sensitivity | Probability of detection given specific DNA quantity and quality | Instrumentation, chemistry, degradation factors |
When evaluating activity-level propositions, scientists must consider the mechanisms of DNA transfer—primary (direct), secondary (indirect), and tertiary transfer—and their associated probabilities. The assignment of probabilities must account for:
Direct Transfer Modeling:
Indirect Transfer Modeling:
The probabilistic assessment requires distinguishing between results, propositions, and explanations, recognizing that while propositions are assessed by the Court, DNA transfer is a factor that scientists need to take into account for the interpretation of their results [23].
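To illustrate, consider the punch-versus-handshake propositions used earlier in this article. The decomposition and every probability below are hypothetical assumptions standing in for case-specific experimental data:

```python
# Probability of observing Mr. A's DNA on the victim under each
# proposition, decomposed by transfer route. All numbers are invented.

def p_dna_given_punch(p_direct, p_persist_detect):
    # Hp (Mr. A punched the victim): one direct transfer step,
    # followed by persistence and detection.
    return p_direct * p_persist_detect

def p_dna_given_handshake(p_to_hand, p_secondary, p_persist_detect):
    # Hd (the offender shook hands with Mr. A): DNA must first reach
    # the intermediary's hand, then transfer on during the punch.
    return p_to_hand * p_secondary * p_persist_detect

numerator = p_dna_given_punch(0.8, 0.5)
denominator = p_dna_given_handshake(0.6, 0.1, 0.5)
lr = numerator / denominator  # the extra indirect step lowers the Hd probability
```

The structural point, independent of the placeholder numbers, is that each additional transfer step in the indirect route multiplies in another probability below 1, which is why secondary transfer typically yields lower probabilities than direct transfer.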
A common concern in activity-level probability assignment is the handling of uncertainty in the numerous variables involved. Sensitivity analysis provides a methodology to determine how much effect any one of the unknown factors has on the value of the findings [2]. The protocol includes:
Procedure:
When scientists encounter factors with considerable impact but uncertain states, these can be incorporated by considering all possible states within the evaluation, weighted by probabilities informed either by data from controlled experiments or supplemented by the analysts' knowledge, which should be available for disclosure and auditing [2].
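A minimal sensitivity-analysis sketch, assuming a simplified LR model and invented inputs, illustrates the approach of sweeping an uncertain factor and checking the stability of the result:

```python
# Sensitivity analysis: vary one uncertain factor (here the
# probability of an innocent indirect transfer) across a plausible
# range and record the resulting LR. All inputs are invented.

def activity_lr(p_transfer_if_hp, p_indirect, p_background):
    numerator = p_transfer_if_hp
    denominator = p_indirect + p_background
    return numerator / denominator

sweep = {p: activity_lr(0.8, p, 0.05) for p in (0.01, 0.05, 0.10, 0.20)}
lr_min, lr_max = min(sweep.values()), max(sweep.values())
# If lr_min and lr_max stay within the same order of magnitude, the
# evaluation is robust to uncertainty in this factor; if not, the
# factor warrants targeted data collection or disclosure.
```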
Moving beyond generic experiments to case-relevant probability assignment represents both a significant challenge and essential evolution in forensic science. The framework presented here provides a structured approach for addressing activity-level propositions through rigorous experimental protocols, systematic knowledge base development, and transparent probabilistic modeling. By implementing these strategies, forensic scientists can provide more meaningful evaluations that help address the central question of how biological material came to be where it was found, ultimately enhancing the value of forensic science in the administration of justice. The successful implementation of these methodologies requires ongoing research to expand knowledge bases, refinement of Bayesian computational tools, and commitment to logical rigor in the interpretation of forensic biological evidence.
The formulation of propositions represents a foundational step in the logical framework for the interpretation of forensic evidence, acting as the crucial link between scientific findings and the legal process. Within the hierarchy of propositions, a clear distinction exists between source-level and activity-level propositions, each serving different roles in forensic evaluation. Source-level propositions concern the origin of biological material itself, such as "the person of interest is the source of the recovered DNA" versus "an unknown person is the source of the recovered DNA" [2]. In contrast, activity-level propositions address how the biological material was transferred within the context of case circumstances, for example, "Mr. A punched the victim" versus "The person who punched the victim shook hands with Mr. A" [2]. The inappropriate formulation of propositions—particularly the use of pseudo-activity and vague terminology—creates significant risks of misinterpretation by the courts and may ultimately lead to miscarriages of justice.
The evolution of DNA profiling technology, capable of producing results from minute quantities of trace material, has accelerated a critical shift in forensic science from the question "whose DNA is this?" to "how did it get there?" [2]. This paradigm shift demands greater precision in proposition formulation to ensure that forensic evaluations address the relevant questions in legal proceedings. Research demonstrates that the strength of observations evaluated under source-level propositions can differ radically from evaluations under activity-level propositions, creating substantial potential for inappropriate conclusions if the two are mistakenly considered equivalent [5]. This protocol provides detailed methodologies for constructing forensically robust propositions that accurately reflect the operational questions in legal contexts while avoiding common formulation pitfalls.
The hierarchy of propositions provides a structured framework for positioning forensic evaluations according to their level of specificity and connection to legal issues. At its core, this hierarchy recognizes that forensic evaluations can address different levels of inquiry, from the general source of biological material to the specific activities that led to its transfer and persistence. The conceptual relationship between these levels follows a logical progression from broad source identification to specific activity inference, with each level incorporating additional case circumstances and contextual factors.
Table 1: Levels in the Hierarchy of Propositions with Examples
| Level | Definition | Example Proposition Pair |
|---|---|---|
| Source Level | Concerns the biological source of the recovered trace material [2]. | H1: The bloodstain came from the defendant. H2: The bloodstain came from another unknown individual [5]. |
| Activity Level | Addresses the activities that led to the transfer, persistence, and detection of biological material [2]. | H1: Mr. A punched the victim. H2: The person who punched the victim shook hands with Mr. A [2]. |
The proper application of this hierarchical framework requires careful case assessment to determine which level of proposition aligns with the factual issues in dispute. As noted in forensic guidelines, "Source level propositions are adequate in cases where there is no risk that the court will misinterpret them in the context of the alleged activities in the case" [5]. In circumstances where factors such as transfer, persistence, and background levels of DNA could crucially affect the strength of the findings, activity-level propositions become necessary to provide meaningful evaluative assistance to the judiciary.
Activity-level inference extends beyond mere source identification to incorporate scientific knowledge about transfer mechanisms, persistence dynamics, and background prevalence of biological materials. The logical framework for evaluating forensic biology results given activity-level propositions requires consideration of multiple probabilistic components, including the transfer of material given specific activities, the persistence of that material over time, and the detection of the material given the analytical methods employed [9]. This multi-factor approach distinguishes activity-level evaluation from the primarily identification-focused source-level evaluation.
Bayesian networks have emerged as particularly valuable tools for structuring the complex logical relationships inherent in activity-level inference [21] [9]. These graphical models represent the probabilistic dependencies between case circumstances, activities, transfer mechanisms, and forensic observations, enabling transparent reasoning under uncertainty. Research demonstrates that narrative Bayesian networks offer a simplified methodology that aligns representations with other forensic disciplines, enhancing user-friendliness and accessibility for both experts and the courts [21]. The formal structure of these networks helps prevent the logical fallacies that commonly arise when source-level evaluations are mistakenly applied to activity-level questions.
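A toy sketch of such a network, evaluated by brute-force enumeration rather than dedicated BN software, shows the underlying computation; all conditional probabilities are invented placeholders:

```python
from itertools import product

# Toy Bayesian-network evaluation by enumeration. Nodes: H (activity
# proposition, "Hp"/"Hd"), T (transfer occurred), B (background DNA
# present), M (match observed). Invented conditional probabilities.

P_T = {"Hp": {True: 0.7, False: 0.3},   # P(T | H)
       "Hd": {True: 0.05, False: 0.95}}
P_B = {True: 0.02, False: 0.98}          # P(B), independent of H
# A match is observed whenever transfer or background deposited DNA.
P_M = {(t, b): 1.0 if (t or b) else 0.0
       for t in (True, False) for b in (True, False)}

def p_match_given(h):
    """Marginalize P(T, B, M=match | H=h) over T and B."""
    return sum(P_T[h][t] * P_B[b] * P_M[(t, b)]
               for t, b in product((True, False), repeat=2))

lr = p_match_given("Hp") / p_match_given("Hd")
```

Real casework networks have more nodes (persistence, recovery, shedder status) and data-informed tables, but the evaluation reduces to the same marginalization over unobserved variables.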
Diagram 1: Logical Framework for Activity-Level Proposition Evaluation. This diagram illustrates the probabilistic dependencies between case circumstances, activities, transfer mechanisms, persistence factors, background levels, forensic observations, and the ultimate evaluation of activity-level propositions.
Objective: To systematically gather and review all available case information to inform appropriate proposition formulation.
Procedure:
Deliverable: A comprehensive case assessment report documenting the factual matrix, trace characteristics, contextual factors, and initially proposed propositions from all parties.
Objective: To formulate balanced, case-specific propositions at the appropriate hierarchical level.
Procedure:
Deliverable: A finalized pair of propositions positioned at the appropriate hierarchical level, with documentation justifying the formulation and hierarchical positioning.
Objective: To construct a narrative Bayesian network that transparently represents the relationship between activities, transfer mechanisms, and forensic observations.
Procedure:
Deliverable: A fully specified Bayesian network for the case, including documentation of structure, parameterization sources, and sensitivity analysis results.
Diagram 2: Bayesian Network for DNA Transfer Evaluation. This diagram illustrates a simplified Bayesian network structure for evaluating DNA findings given activity-level propositions, incorporating transfer, persistence, and background DNA components.
Objective: To generate quantitative data on DNA transfer probabilities associated with specific activities for informing activity-level evaluations.
Experimental Design:
Application: The resulting transfer probabilities populate conditional probability tables in Bayesian networks for activity-level evaluation [2].
Objective: To quantify DNA persistence over time on relevant substrates under various environmental conditions.
Experimental Design:
Application: Persistence probabilities inform temporal aspects of activity-level evaluation, particularly when timing of activities is disputed [2].
Table 2: Essential Research Materials for Activity-Level Evidence Studies
| Reagent/Material | Function in Research | Application Example |
|---|---|---|
| Standardized DNA Donors | Provides consistent source material for transfer studies | Recruitment of donors with characterized shedder status for transfer probability experiments [2]. |
| Quantitative PCR Assays | Precisely measures DNA quantity recovered from substrates | Determining DNA transfer amounts under different activity scenarios [2]. |
| Bayesian Network Software | Provides computational framework for complex probability modeling | Implementing narrative Bayesian networks for case-specific evaluation [21]. |
| STRmix or Probabilistic Genotyping Software | Interprets complex DNA mixtures using probabilistic methods | Supporting evaluation given sub-source level propositions as part of larger activity-level framework [4]. |
| Controlled Environment Chambers | Maintains consistent temperature, humidity for persistence studies | Testing DNA persistence under different environmental conditions [2]. |
Objective: To demonstrate how the strength of forensic findings varies depending on the proposition level.
Experimental Approach: Using a simulated case scenario, compare likelihood ratios calculated for the same DNA findings under different proposition pairs.
Table 3: Likelihood Ratio Comparison Across Proposition Levels
| Scenario Description | Source-Level LR | Activity-Level LR | Key Factors Influencing Difference |
|---|---|---|---|
| DNA recovered from freshly broken window | 1 × 10^9 | 8 × 10^8 | Minimal difference: Transfer and persistence factors not contested [5]. |
| Low-level DNA from clothing after alleged assault | 1 × 10^6 | 45 | Substantial difference: Secondary transfer and background DNA probabilities reduce activity-level LR [5]. |
| DNA from handled object with innocent explanation | 1 × 10^7 | 2 | Dramatic difference: High background prevalence and indirect transfer pathways [2]. |
The data illustrate a crucial principle: the probative value of DNA evidence assessed at source level can differ radically from its value assessed at activity level, particularly when transfer mechanisms, persistence, and background prevalence introduce alternative explanations for the presence of DNA [2] [5]. This emphasizes the importance of matching proposition level to the facts actually in dispute.
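The magnitude of this divergence can be expressed in orders of magnitude; the sketch below simply restates the values from Table 3:

```python
import math

# Orders-of-magnitude gap between source-level and activity-level LRs
# for the three scenarios in Table 3 (values taken from the table).
scenarios = {
    "broken window": (1e9, 8e8),
    "clothing after assault": (1e6, 45),
    "handled object": (1e7, 2),
}

gaps = {name: math.log10(src_lr) - math.log10(act_lr)
        for name, (src_lr, act_lr) in scenarios.items()}
# The first scenario barely shifts, while the third drops by almost
# seven orders of magnitude once transfer and background are considered.
```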
The formulation of precise, hierarchically appropriate propositions is fundamental to the scientifically valid and legally relevant evaluation of forensic biology results. The protocols outlined in this document provide a structured approach to avoiding pseudo-activity and vague terminology in proposition development. By implementing case-specific assessment, hierarchical positioning, and Bayesian network modeling, forensic scientists can significantly enhance the transparency and robustness of their evaluative processes. The experimental protocols further support this framework by generating empirical data on transfer and persistence phenomena essential for activity-level evaluation. As forensic genetics continues to evolve toward addressing activity-level questions, the critical examination and refinement of proposition formulation practices remain essential for the safe administration of justice.
Within research frameworks investigating source level versus activity level propositions, effective resource allocation and expert knowledge building present significant operational challenges. Source level propositions typically address questions about the origin of biological material, while activity level propositions help courts understand how biological material was transferred through specific activities [23] [2]. This distinction creates unique resource management demands in forensic and drug development contexts, requiring sophisticated approaches to allocating specialized personnel and building experimental knowledge bases.
The evaluation of biological traces considering activity level propositions necessitates careful consideration of transfer, persistence, and background prevalence of biological material [2]. These complex evaluations demand both appropriate statistical frameworks and precisely allocated expert resources. Similarly, drug development master protocols require strategic resource allocation across multiple substudies to efficiently generate evidence about targeted therapies [42]. In both fields, the transition from simple source identification to activity assessment represents a significant challenge that impacts how resources should be allocated and expertise developed.
Research organizations frequently encounter specific resource allocation problems that impact their operational efficiency and scientific output. The table below summarizes these challenges and their impacts on research activities.
Table 1: Resource Allocation Challenges in Research Organizations
| Challenge | Impact on Research Operations | Relevant Context |
|---|---|---|
| Resource Overallocation and Underutilization | Decreased productivity, compromised quality, burnout among researchers [43] | Impacts quality of data collection for activity-level knowledge bases |
| Lack of Specialized Skills | Inefficiencies, delays, compromised project outcomes [43] | Limits ability to design experiments for transfer/persistence studies |
| Insufficient Resource Forecasting | Project delays, cost overruns, missed opportunities [43] | Affects long-term research on proposition hierarchies |
| Ineffective Workload Balancing | Overburdened resources, decreased productivity, increased stress [43] | Reduces quality of evaluative reporting in casework |
| Scheduling Conflicts | Overbooking, timeline disruptions, operational inefficiencies [44] | Impedes complex experimental designs requiring multiple specialists |
Implementing strategic resource allocation solutions directly enhances research quality and reliability, particularly when building knowledge bases for activity level proposition evaluations.
Capacity Planning and Resource Leveling: Research organizations should evaluate resource availability and workload capacity to prevent overcommitment of specialized personnel [43]. This is particularly important for designing experiments that form knowledge bases for activity level proposition evaluation, where consistent researcher attention is critical [23]. Establishing realistic timelines and allocating resources accordingly prevents overallocation, especially during data collection phases for transfer and persistence studies.
Advanced Forecasting Techniques: Utilizing historical data analysis, statistical modeling, and expert judgment improves resource forecasting accuracy [43]. For research on activity level propositions, this includes anticipating needs for specific expertise during different research phases, from experimental design to statistical evaluation using Bayesian networks [23]. Regular data monitoring and collaborative forecasting involving principal investigators ensure resource allocation aligns with research priorities.
Structured Skills Management: Maintaining a centralized skill inventory enables research organizations to track researcher competencies, certifications, and skill gaps in real-time [44]. This is essential for activity level research, which requires specialized knowledge in evidence evaluation, statistical analysis, and experimental design. Targeted training programs and strategic hiring based on forecasted needs ensure the organization can address the complex demands of source versus activity level proposition research.
Building robust knowledge bases for evaluating activity level propositions requires carefully designed experiments that generate relevant quantitative data on transfer, persistence, and prevalence of biological material.
Table 2: Experimental Parameters for Knowledge Base Development
| Experimental Parameter | Data Collection Method | Proposed Analysis Approach |
|---|---|---|
| Transfer mechanisms | Controlled transfer studies under varying conditions [2] | Quantitative analysis of DNA transfer rates and patterns |
| Persistence timelines | Time-series sampling under different environmental conditions | Statistical modeling of degradation rates and persistence probabilities |
| Background prevalence | Systematic sampling of relevant environments [23] | Frequency distribution analysis and database development |
| Activity scenarios | Simulation of alleged activities with varying parameters [2] | Bayesian network analysis for likelihood ratio development |
| Shedder status | Quantification of DNA shedding rates across individuals | Classification models and impact assessment on transfer probabilities |
The data generated from these experiments must be relevant to case-specific circumstances and sufficient for assigning probabilities within the likelihood ratio framework used for evaluating evidence under activity level propositions [23]. This requires researchers to design experiments that capture the variability present in real-world conditions while maintaining scientific rigor.
Quantitative data from knowledge-building experiments requires appropriate summarization and representation to be useful for evidence evaluation. Distribution of continuous data, such as transfer probabilities or persistence times, can be displayed using histograms with carefully selected bins to avoid ambiguity [45]. For discrete data, such as counts of specific transfer occurrences, frequency tables provide appropriate summaries.
Statistical summaries should include measurements of central tendency (mean, median) and variation (standard deviation, range) to adequately characterize the data [45]. These summaries form the basis for assigning probabilities when evaluating results given activity level propositions, enabling forensic scientists to assess the probability of the evidence under each competing proposition [23].
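A short sketch of these summaries, using Python's standard library and an invented pilot dataset of persistence times:

```python
import statistics

# Hypothetical pilot data: hours until deposited DNA fell below the
# detection threshold on a given substrate.
persistence_hours = [4, 6, 6, 8, 9, 12, 15, 15, 18, 24]

summary = {
    "mean": statistics.mean(persistence_hours),
    "median": statistics.median(persistence_hours),
    "stdev": statistics.stdev(persistence_hours),
    "range": (min(persistence_hours), max(persistence_hours)),
}

# Explicit bin edges avoid the ambiguity of auto-chosen histogram bins
# when the summaries feed downstream probability assignments.
bin_edges = [0, 6, 12, 18, 25]
counts = [sum(lo <= h < hi for h in persistence_hours)
          for lo, hi in zip(bin_edges, bin_edges[1:])]
```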
Protocol Title: Strategic Resource Allocation for Activity Level Proposition Research
Objective: To ensure optimal allocation of human and technical resources throughout the research lifecycle
Duration: Ongoing with quarterly review cycles
Methodology:
Demand Forecasting:
Allocation Optimization:
Monitoring and Adjustment:
Protocol Title: Experimental Design for Transfer and Persistence Knowledge Bases
Objective: To generate reliable quantitative data supporting activity level proposition evaluations
Duration: Study-specific with ongoing knowledge integration
Methodology:
Experimental Design:
Data Collection and Management:
Data Analysis and Implementation:
Resource Allocation Workflow for Research Operations
Knowledge Building Framework for Activity Level Propositions
Table 3: Essential Research Reagents for Transfer and Persistence Studies
| Reagent/Material | Function in Research | Application Context |
|---|---|---|
| DNA Extraction Kits | Isolation of genetic material from various substrates | Recovery of DNA from transfer surfaces for quantification |
| Quantitative PCR Reagents | Measurement of DNA quantity and quality | Assessment of DNA transfer amounts and degradation rates |
| Statistical Analysis Software | Data analysis and likelihood ratio calculation | Bayesian network implementation for activity level evaluation [23] |
| Controlled DNA Sources | Standardized biological material for transfer studies | Experimental simulation of activities under controlled conditions |
| Substrate Collection Kits | Standardized sampling from various surfaces | Consistent data generation across multiple experiments |
| Database Management Systems | Storage and retrieval of experimental data | Formation of knowledge bases for probability assignment [23] |
In forensic science, the interpretation of evidence does not occur in a vacuum; it is fundamentally guided by the propositions (or hypotheses) put forward by the prosecution and defense. These propositions can be arranged in a hierarchy, with source-level and activity-level propositions representing two distinct, critical tiers [2] [1]. Source-level propositions address the question of "Whose DNA is this?" by considering the source of a biological stain [46] [47]. In contrast, activity-level propositions address the more complex question of "How did this DNA get there?" by evaluating the activities that led to the deposition of the biological material [2] [5]. The distinction is paramount because a likelihood ratio calculated for a source-level proposition cannot be carried over to an activity-level context without potentially causing severe misinterpretation of the evidence's true probative value [46] [5] [47]. This application note provides a structured framework for researchers and forensic scientists to determine when a source-level assessment is sufficient and when it becomes misleading, necessitating an activity-level evaluation.
The core of the proposition hierarchy lies in the questions each level seeks to answer. The following table delineates the defining characteristics, appropriate use cases, and limitations of source-level propositions.
Table 1: Characteristics and Application of Source-Level Propositions
| Aspect | Source-Level Propositions |
|---|---|
| Core Question | "Whose DNA is this?" or "Is the person of interest the source of this recovered DNA?" [46] [1] |
| Typical Form | Prosecution: "The DNA came from the Person of Interest (POI)." Defense: "The DNA came from an unknown person." [2] |
| Factors Considered | Primarily the rarity of the DNA profile in a relevant population [2] [5]. |
| Sufficient When | The source of the DNA is the only disputed issue, and case circumstances indicate that the presence of the DNA is directly and unequivocally related to the criminal activity. There is no viable alternative activity that could explain the presence of the DNA [5]. |
| Becomes Misleading When | The issue is not the source, but rather the mechanism of transfer (e.g., how the DNA was deposited). This is common in cases with low-level DNA, transfer via innocent contact, or the presence of background DNA [2] [5] [6]. |
The decision to use a source-level proposition is not merely a technical choice but a critical risk assessment. Relying on a source-level proposition when an activity-level evaluation is required can be highly misleading. For instance, a source-level Likelihood Ratio (LR) on the order of >10²⁰ might be reported, while the strength of the findings given activity-level propositions, once transfer, persistence, and background are considered, could be substantially more moderate [5]. This overstatement of evidence value can lead to miscarriages of justice.
Table 2: Scenarios Differentiating Source and Activity-Level Assessments
| Scenario | Appropriate Level | Rationale |
|---|---|---|
| A large, fresh bloodstain at a point of entry in a burglary; the suspect denies ever being on the premises. | Source-Level [5] | The appearance and location of the stain directly link it to the crime. The only dispute is the identity of the person who bled. Alternative mechanisms for the stain's presence are not reasonably postulated. |
| Low-level DNA from a co-worker found on a victim's collar in an alleged assault; the suspect admits to a recent friendly handshake. | Activity-Level [2] [5] | The source of the DNA is not contested. The core issue is the activity that caused the transfer (a punch vs. a handshake). A source-level assessment would be irrelevant and misleading. |
| DNA from a suspect is found on a handled object (e.g., a weapon). The defense claims the suspect handled the object innocently days before the crime. | Activity-Level [2] | The dispute is not about who touched the object, but when and why. Evaluating this requires considering DNA persistence and background. |
The following diagram illustrates the logical decision process a forensic scientist should employ when determining the appropriate level of proposition for a case.
Objective: To systematically review case information and define the appropriate propositions and evaluation strategy before conducting DNA analysis to avoid cognitive bias [46] [1].
Workflow:
Objective: To quantitatively assess the probative value of forensic findings given activity-level propositions using a Likelihood Ratio (LR) framework that incorporates transfer, persistence, and background (TPB).
Workflow:
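A minimal numerical sketch of a TPB-style evaluation, assuming an exponential persistence model and wholly invented parameters:

```python
import math

# Transfer-persistence-background (TPB) sketch. Persistence is modeled
# as exponential decay over the time since the alleged activity; all
# parameter values are hypothetical placeholders.

def p_findings_given_activity(p_transfer, decay_rate, hours_elapsed, p_recover):
    p_persist = math.exp(-decay_rate * hours_elapsed)
    return p_transfer * p_persist * p_recover

def p_findings_given_alternative(p_background):
    # Under the defense proposition the DNA is present only as
    # background, irrespective of the alleged activity.
    return p_background

numerator = p_findings_given_activity(p_transfer=0.7, decay_rate=0.05,
                                      hours_elapsed=12, p_recover=0.8)
denominator = p_findings_given_alternative(p_background=0.03)
lr = numerator / denominator
```

The choice of an exponential decay curve here is an assumption for illustration; in practice the persistence model should be fitted to time-series data from the relevant substrate and conditions.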
Table 3: Essential Research Reagents and Models for Proposition Evaluation
| Tool / Reagent | Function / Explanation |
|---|---|
| DNA Profiling Kits (e.g., STR Multiplex Kits) | Generate the DNA profile from the trace material. The primary output for source-level comparisons and profile rarity calculations [46]. |
| Sensitive Detection Chemistries | Enable the detection of low-template DNA, which is common in activities involving casual contact and is a key trigger for moving to activity-level considerations [2] [5]. |
| Probabilistic Genotyping Software | Used to interpret complex DNA mixtures and calculate LRs for source-level propositions, providing a statistical weight for the DNA profile match [46] [5]. |
| Bayesian Networks (BNs) | A graphical probabilistic model that represents the dependencies between variables. BNs are powerful computational tools for structuring the complex reasoning and combining multiple pieces of evidence required for activity-level evaluations [14] [48] [6]. |
| Chain Event Graphs (CEGs) | An extension of Bayesian networks that is particularly adept at modeling asymmetric, time-ordered sequences of activities. CEGs help frame the LRs needed for complex activity-level propositions by mapping out all possible event pathways [14]. |
The following diagram visualizes a generic Bayesian Network structure for evaluating activity-level propositions, incorporating the key concepts of transfer, persistence, and background.
The choice between source-level and activity-level propositions is a cornerstone of logically sound and forensically relevant evidence evaluation. Source-level propositions are sufficient and powerful only when the source of the DNA is the sole matter of dispute and its link to the criminal act is unambiguous. In the modern era of sensitive DNA detection, where trace evidence is readily transferred through innocent activities, a source-level assessment is often inadequate and can be profoundly misleading. For researchers and practitioners, the mandatory shift to activity-level reasoning is required whenever the questions of "how," "when," or "why" the DNA was deposited are central to the case. Adhering to structured protocols, leveraging appropriate computational tools, and maintaining transparency in probability assignments are essential for ensuring that the true probative value of forensic DNA evidence is communicated to the courts.
Within forensic science, a fundamental distinction exists between source-level propositions and activity-level propositions. Source-level analysis asks, "What is the origin of this trace?" while activity-level analysis addresses, "How did this trace get here, and what activity does it represent?" [3]. This application note demonstrates that the Likelihood Ratio (LR), a measure of probative value, can differ significantly between these two levels of analysis. We present a structured, data-driven protocol to quantify this probative value gap, enabling researchers and legal professionals to assess forensic evidence with greater precision and contextual accuracy.
The evaluation of forensic evidence is structured by a hierarchy of propositions [3]:
The probative value gap arises because a piece of evidence that is extremely rare and thus strongly supportive at the source level (e.g., a matching DNA profile) may have its strength considerably moderated when assessed at the activity level. This moderation incorporates factors like the possibility of transfer, persistence, recovery (TPR), and the presence of background levels of DNA, which are not considered in a simple source assessment [2].
This protocol provides a step-by-step methodology for constructing Bayesian Network (BN) models to compare LRs calculated under source-level and activity-level propositions.
1. Define the relevant case information (I) that will condition the entire evaluation.
2. Formulate the competing propositions for the prosecution (Hp) and defense (Hd):
   - Source level:
     - Hp: The DNA on the item came from the person of interest (POI).
     - Hd: The DNA on the item came from an unknown person.
   - Activity level:
     - Hp: The POI performed the specific activity (e.g., placed the bottle).
     - Hd: An unknown person performed the activity.
3. Construct the Bayesian Network with nodes for:
   - H: The main activity-level proposition node.
   - Transfer: Probability of DNA transfer given the activity.
   - Background: Probability of finding background DNA on the item.
   - DNA Result: The main finding (e.g., a DNA match).
4. Define the states of each node (e.g., "Hp"/"Hd" for H; "Yes"/"No" for Transfer).
5. Instantiate the findings by setting the DNA Result node to "Match."
6. Calculate the source-level LR, with H representing source-level propositions. The LR is calculated as P(Match | Hp_source) / P(Match | Hd_source).
7. Calculate the activity-level LR, with H representing activity-level propositions. The LR is now P(Match | Hp_activity) / P(Match | Hd_activity).

We apply the protocol to a published case, R v QUIST, to demonstrate a quantitative probative value gap [6].
Source-level propositions:

- Hp: The defendant is the source of the DNA on the bottles.
- Hd: An unknown person is the source of the DNA on the bottles.

Activity-level propositions:

- Hp: The defendant filled the bottles with petrol and placed them in the ceiling.
- Hd: An unknown offender filled the bottles with petrol and placed them in the ceiling.

The BN for this case incorporates nodes for the main activity proposition, DNA transfer from the actor, background DNA presence, and—critically—a common unknown source to account for the same unknown DNA profile appearing on multiple bottles [6].
The table below summarizes the Likelihood Ratio (LR) outcomes for the case, demonstrating the significant probative value gap.
Table 1: Quantitative LR Comparison for R v QUIST Case
| Proposition Level | Likelihood Ratio (LR) | Probative Value Interpretation |
|---|---|---|
| Source Level | ~10¹⁸ (one quintillion) | Extremely strong support for Hp (defendant is source) |
| Activity Level | ~200 | Moderately strong support for Hp (defendant placed bottles) |
| Probative Value Gap | ~5 × 10¹⁵-fold reduction | Activity-level LR is roughly 5 × 10¹⁵ times lower than the source-level LR |
The extreme difference in LRs, a reduction by a factor of roughly 5 × 10¹⁵, quantitatively demonstrates the probative value gap. The source-level LR, based purely on profile rarity, suggests essentially conclusive evidence. However, the activity-level LR is dramatically lower because it incorporates the real-world possibility of innocent transfer of the defendant's DNA (as he was present in the toilet) and the presence of a common unknown donor's DNA on multiple bottles, which is a scenario that must be explained under both propositions [6].
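The reduction factor follows directly from the two LR values reported for the case in Table 1; a two-line check confirms the order of magnitude:

```python
# Probative value gap = source-level LR / activity-level LR.
# Values are the approximate figures reported for R v QUIST in Table 1.
source_lr = 1e18     # ~10^18 from profile rarity alone
activity_lr = 200    # ~200 once transfer and background are modelled

gap = source_lr / activity_lr
print(f"reduction factor ≈ {gap:.0e}")
```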
Table 2: Essential Materials and Analytical Tools for Probative Value Research
| Item | Function in Research |
|---|---|
| Bayesian Network Software (e.g., Hugin, Netica) | Provides the computational environment to construct probabilistic models, input conditional probabilities, and calculate Likelihood Ratios based on case scenarios. |
| Transfer, Persistence, Recovery (TPR) Datasets | Empirical data from controlled studies used to inform the probability assignments for transfer, background presence, and recovery nodes within the Bayesian Network. |
| DNA Profile Frequency Data | Population statistics used to calculate the source-level LR and to assess the probability of randomly matching a background DNA profile. |
| "Fit-for-Purpose" Validation Framework | A principle for guiding the extent of method validation based on the intended use of the data, ensuring the evaluation model is robust and appropriate for the case at hand [49] [50]. |
This protocol establishes a clear, reproducible method for quantifying the often-overlooked disparity between the apparent strength of forensic evidence at the source level and its actual strength when contextualized within activity-level propositions. The case example of R v QUIST provides a stark quantitative demonstration: evidence with a quintillion-to-one LR at the source level can be reduced to a several-hundred-to-one LR at the activity level. Researchers and practitioners are urged to adopt this structured, BN-based approach to evidence evaluation. This ensures that the probative value of forensic findings is not overstated and is communicated to the justice system in a transparent, logical, and balanced manner.
Evaluative reporting in scientific research, particularly when transitioning from straightforward source-level propositions to complex activity-level propositions, demands a robust methodological framework. Source-level propositions concern the origin of evidence, such as "the person of interest is the source of the crime stain" versus "an unknown person is the source" [2]. In contrast, activity-level propositions address how evidence came to be in its context, such as "Mr. A punched the victim" versus "The person who punched the victim shook hands with Mr. A" [2]. This shift significantly increases evaluation complexity, requiring careful consideration of transfer, persistence, and background prevalence of evidence [2].
Methodological robustness serves as the cornerstone of reliable and trustworthy outcomes in this process, ensuring that evaluation methods remain strong and dependable across varying conditions [51]. In the context of pharmaceutical research, similar challenges emerge when evaluating Value-Added Medicines (VAMs), where a core evaluation framework must capture diverse benefits including unmet medical needs, health gain, patient-reported outcomes, and burden on healthcare systems [52]. This article establishes comprehensive protocols for validating methodological robustness across forensic and pharmaceutical domains, addressing the critical balance between scientific rigor and practical feasibility in evaluative reporting.
The evolution from source-level to activity-level propositions represents a fundamental shift in evaluative focus. Source-level analysis primarily concerns itself with evidence origin and relies heavily on population rarity statistics [2]. This approach proves sufficient when questions pertain purely to identification but becomes inadequate when contextual factors influence evidence interpretation.
Activity-level propositions introduce contextual complexity requiring consideration of transfer mechanisms, persistence factors, and background prevalence [2]. For example, DNA transfer dynamics become crucial when evaluating whether DNA presence results from primary transfer (direct contact) or secondary transfer (indirect contact). The probative value of evidence shifts dramatically when moving between these proposition levels, necessitating more sophisticated evaluation frameworks that incorporate contextual factors beyond simple identification.
Methodological robustness encompasses the strength and dependability of evaluation processes, ensuring consistent results despite minor methodological variations or contextual shifts [51]. This concept transcends disciplines, applying equally to forensic evidence evaluation and pharmaceutical value assessment.
Key elements of methodological robustness include:
In pharmaceutical evaluation, robustness manifests through structured frameworks assessing VAMs across multiple value domains, including efficacy, safety, patient experience, adherence, quality of life, and economic impact on both households and healthcare systems [52] [53].
The development of a core evaluation framework for VAMs demonstrates the practical application of methodological robustness principles. Through systematic literature review and expert validation, researchers established a structured approach encompassing 11 value domains grouped into 5 thematic clusters [52]:
Table 1: Core Value Assessment Framework for Value-Added Medicines
| Thematic Cluster | Value Domains | Measurement Considerations |
|---|---|---|
| Unmet Medical Needs | 1. Extending treatment options in new indication with unmet medical need | Addresses neglected areas where existing treatments are inadequate [52] |
| Health Gain | 2. Individual needs/special needs of patient (sub)population; 3. Efficacy/Effectiveness; 4. Patient safety and tolerability | Measured by healthcare professionals; includes clinical outcomes and safety profiles [52] |
| Patient-Reported Outcomes | 5. Patient experience related to the therapy; 6. Adherence and Persistence; 7. Quality of life | Captures patient perspective on treatment experience and outcomes [52] |
| Burden on Households | 8. Patient's economic burden; 9. Economic and health burden on informal caregiver | Assesses direct and indirect costs to patients and families [52] |
| Burden on Health Care System | 10. Health care resource utilization, costs or efficiency; 11. Technological improvement with logistical considerations | Evaluates system-level impact and efficiency improvements [52] |
This framework reduces heterogeneity in value assessment processes across different jurisdictions while creating incentives for manufacturers to invest in incremental innovation [52]. The approach balances comprehensive coverage with practical applicability, acknowledging that some domains may require adaptation to specific national contexts.
The core evaluation framework can be adapted to various decision-making contexts, reflecting the need for methodological flexibility in different policy environments:
Deliberative Processes: In systems relying on expert deliberation, the framework provides structured guidance for exempting VAMs from generic pricing mechanisms or justifying price premiums based on demonstrated value [53].
Augmented Cost-Effectiveness Analysis: For jurisdictions mandating cost-effectiveness analysis, the framework's domains can be incorporated as additional benefits, cost modifiers, or threshold adjustments, moving beyond traditional quality-adjusted life year (QALY) metrics [53].
Multi-Criteria Decision Analysis (MCDA): The framework can be operationalized through MCDA methodologies that assign weights to different criteria, transforming ad hoc decisions into transparent, replicable processes [53]. This approach aligns with developing trends in health technology assessment for specialized healthcare technologies.
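The weighted-combination step at the heart of MCDA can be sketched in a few lines. The criterion names, scores, and weights below are invented for illustration; they are not an endorsed VAM weighting scheme.

```python
# Weighted-sum MCDA sketch: score each alternative on several criteria,
# normalise the weights, and combine into a single value score.
# All criteria, weights, and scores here are illustrative assumptions.

def mcda_score(scores: dict, weights: dict) -> float:
    """Weighted average of criterion scores (weights need not sum to 1)."""
    total_w = sum(weights.values())
    return sum(scores[c] * w for c, w in weights.items()) / total_w

weights = {"efficacy": 0.4, "safety": 0.3, "adherence": 0.2, "system_burden": 0.1}
vam = {"efficacy": 7, "safety": 8, "adherence": 9, "system_burden": 6}
comparator = {"efficacy": 7, "safety": 8, "adherence": 5, "system_burden": 6}

print(mcda_score(vam, weights), mcda_score(comparator, weights))
```

Making the weights explicit is exactly what turns an ad hoc judgment into a transparent, replicable process: two evaluators who agree on the scores and weights must arrive at the same ranking.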
The Chinese framework for clinical comprehensive evaluation of drugs demonstrates a similar structured approach, evaluating drugs across six first-level indicators: safety, efficacy, costs/cost-effectiveness, novelty, suitability, and accessibility [54]. This framework further refines second-level indicators specific to drug classes and diseases, employing Delphi methods and Analytic Hierarchy Processes to establish weighting through expert consensus [54].
The Delphi method enables structured communication among experts to achieve consensus on evaluation criteria and weighting, particularly valuable for establishing robust frameworks in complex domains with limited quantitative data.
Table 2: Research Reagent Solutions for Delphi Method Implementation
| Research Reagent | Function | Implementation Considerations |
|---|---|---|
| Expert Panel | Provide specialized knowledge and practical experience | Recruit 10-100 experts across relevant disciplines (e.g., clinical pharmacists, physicians, economists) [54] |
| Structured Questionnaire | Solicit independent opinions on framework components | Develop using professional online survey tools; include rating scales and open-ended justification [54] |
| Controlled Feedback Mechanism | Share aggregated group responses while maintaining anonymity | Provide statistical representation of group responses and reasons for judgments between rounds [54] |
| Pre-defined Consensus Threshold | Establish objective criteria for agreement | Typically set at 70% agreement on components; may vary by study requirements [54] |
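The pre-defined consensus threshold in the table above can be checked mechanically after each Delphi round. The sketch below uses mock ratings and assumes a 5-point importance scale with "agreement" meaning a rating of 4 or higher; both conventions are illustrative assumptions.

```python
# Delphi-round consensus check: an item reaches consensus when the share of
# experts rating it "important" (here, >=4 on an assumed 5-point scale)
# meets the pre-defined threshold. Ratings below are mock data.
THRESHOLD = 0.70  # 70% agreement, as in the table above

def reaches_consensus(ratings: list[int], cutoff: int = 4) -> bool:
    agree = sum(r >= cutoff for r in ratings) / len(ratings)
    return agree >= THRESHOLD

print(reaches_consensus([5, 4, 4, 3, 5, 4, 2, 5, 4, 4]))  # 8/10 agree -> True
print(reaches_consensus([5, 2, 3, 3, 4, 2, 2, 5, 3, 4]))  # 4/10 agree -> False
```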
Procedure:
MCDA provides a systematic approach for evaluating alternatives against multiple, often conflicting criteria, making it particularly suitable for assessing VAMs or evaluating forensic propositions with complex value dimensions.
Procedure:
Evaluating activity-level propositions requires synthesizing diverse evidence types to address transfer, persistence, and background prevalence considerations.
Procedure:
Diagram 1: Evaluative Reporting Workflow
Diagram 2: Forensic Proposition Hierarchy
Implementing methodologically robust evaluation frameworks presents significant practical challenges, particularly regarding evidence generation and resource requirements. In pharmaceutical evaluation, the evidence base for Value-Added Medicines often differs from traditional originator pharmaceuticals, frequently relying on real-world evidence rather than large-scale pivotal clinical trials [52] [53]. Similarly, in forensic evaluation, activity-level propositions require data on transfer and persistence mechanisms that may be limited or context-dependent [2].
Potential solutions include:
The tension between scientific ideal and practical constraints necessitates balanced approaches that maintain methodological robustness while acknowledging implementation realities. This balance is particularly crucial when evaluation results inform high-stakes decisions in healthcare resource allocation or legal proceedings.
Validating methodological robustness in evaluative reporting requires systematic approaches that maintain balance, logic, and transparency across diverse application contexts. The structured frameworks and experimental protocols presented provide concrete methodologies for enhancing evaluative practice in both pharmaceutical and forensic domains. As evaluation questions evolve from simple source-level attribution to complex activity-level explanations, methodological frameworks must correspondingly advance to address additional contextual factors and uncertainties. By implementing robust, transparent evaluation processes grounded in structured methodologies, researchers and evaluators can enhance the credibility and utility of their conclusions across diverse decision-making contexts.
Forensic science operates within a hierarchical framework of propositions, ranging from source-level to activity-level inquiries. Source-level propositions address the origin of biological material, traditionally focusing on questions like "Did this DNA come from this suspect?" [2]. In contrast, activity-level propositions address more complex questions about how evidence arrived at a crime scene, dealing with "how" and "when" specific actions occurred [10]. This evolution from source to activity level represents a critical advancement in forensic science, moving beyond mere identification to reconstructing sequences of events.
Despite the judicial system's increasing need for activity-level interpretation, significant methodological and educational barriers impede its effective implementation in courtrooms globally [10]. This application note examines these challenges and provides structured protocols to help researchers and forensic practitioners generate robust, defensible activity-level evaluations suitable for judicial proceedings.
Advanced DNA technologies now enable analysis of minimal trace quantities, making source identification increasingly routine but less forensically decisive [2]. As one research paper notes, there is now a shift from the question "whose DNA is this?" to the question "how did it get there?" [2]. This paradigm shift makes activity-level assessment essential for contextualizing findings within alleged criminal activities.
Courts frequently encounter situations where activity-level assessment is essential for accurate case resolution. Practitioners report facing activity-focused questions "on the witness stand with increasing frequency" [10]. Despite this demand, many forensic scientists remain reluctant to address activity-level propositions due to perceived methodological limitations and data deficiencies [2].
Table 1: Comparison of Source-Level vs. Activity-Level Propositions
| Aspect | Source-Level Propositions | Activity-Level Propositions |
|---|---|---|
| Core Question | "Whose DNA is this?" [2] | "How did the DNA get there?" [2] [23] |
| Focus | Identity of biological material | Mechanisms and timing of transfer |
| Key Metrics | Profile rarity in population [2] | Transfer mechanisms, persistence, background prevalence [2] |
| Data Requirements | Population frequency databases [2] | Transfer probabilities, background levels, degradation rates [2] |
| Typical Output | Random match probability | Likelihood ratio addressing activities [23] |
Proper proposition formulation is fundamental to robust activity-level assessment. The International Society for Forensic Genetics recommends these steps [23]:
Example Application: In a case involving alleged assault, appropriate propositions might be:

- Hp: Mr. A punched the victim.
- Hd: The person who punched the victim shook hands with Mr. A [2].
The likelihood ratio (LR) provides a quantitative measure of evidential strength, calculated as [14]:

LR = P(E | H1) / P(E | H2)

Where E represents the forensic findings, H1 represents the prosecution proposition, and H2 represents the defense proposition [14].
Table 2: Likelihood Ratio Interpretation Guide
| LR Value | Support for H1 | Verbal Equivalent |
|---|---|---|
| >10,000 | Very strong | Extremely strong support for prosecution proposition |
| 1,000-10,000 | Strong | Strong support for prosecution proposition |
| 100-1,000 | Moderately strong | Moderate support for prosecution proposition |
| 1-100 | Limited | Limited support for prosecution proposition |
| 1 | No support | Evidence equally supports both propositions |
| <1 | Support for H2 | Support for defense proposition |
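The bands in Table 2 translate directly into a lookup function. The sketch below follows this particular table; note that other published verbal scales use different band boundaries.

```python
# Map a likelihood ratio to the verbal scale of Table 2.
def verbal_equivalent(lr: float) -> str:
    if lr > 10_000:
        return "Very strong support for H1"
    if lr > 1_000:
        return "Strong support for H1"
    if lr > 100:
        return "Moderately strong support for H1"
    if lr > 1:
        return "Limited support for H1"
    if lr == 1:
        return "Evidence equally supports both propositions"
    return "Support for H2"

print(verbal_equivalent(250))   # Moderately strong support for H1
print(verbal_equivalent(0.1))   # Support for H2
```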
Purpose: To generate quantitative data on DNA transfer and persistence relevant to specific activities.
Materials:
Methodology:
Data Interpretation: Results should be expressed as probability distributions for transfer and persistence under controlled conditions, noting limitations for direct application to casework.
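One common way to express such counts as a probability distribution is a Beta posterior over the transfer probability. The sketch below assumes a uniform Beta(1,1) prior and uses mock counts; both the prior choice and the numbers are illustrative assumptions, not part of the protocol.

```python
# Turning experiment counts into a probability distribution for transfer:
# with s transfers observed in n trials, a uniform Beta(1,1) prior yields a
# Beta(s+1, n-s+1) posterior. Counts below are mock experimental data.
import math

def beta_posterior_summary(s: int, n: int) -> tuple[float, float]:
    """Posterior mean and standard deviation of the transfer probability."""
    a, b = s + 1, n - s + 1                      # posterior Beta parameters
    mean = a / (a + b)
    var = a * b / ((a + b) ** 2 * (a + b + 1))
    return mean, math.sqrt(var)

mean, sd = beta_posterior_summary(s=18, n=30)    # 18 transfers in 30 trials
print(f"posterior mean ≈ {mean:.3f}, sd ≈ {sd:.3f}")
```

Reporting the full distribution (rather than a single point estimate) makes the residual uncertainty visible when the value is later plugged into a Bayesian Network node.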
Purpose: To determine the frequency and composition of DNA background on various surfaces in different environments.
Materials:
Methodology:
Bayesian Networks (BNs) provide a robust computational framework for evaluating complex activity scenarios by explicitly modeling dependencies between variables [23]. For activity-level assessment, BNs incorporate transfer mechanisms, background prevalence, and alternative activity scenarios.
Bayesian Network for DNA Transfer
Chain Event Graphs (CEGs) offer enhanced capability for modeling asymmetric, time-ordered activity sequences that commonly occur in criminal scenarios [14]. CEGs maintain temporal sequencing while capturing conditional independence relationships.
Chain Event Graph for Activity Propositions
Table 3: Essential Materials for Activity-Level Research
| Item | Function | Application Notes |
|---|---|---|
| Standardized donor panels | Provide controlled DNA source with known shedder status | Include representatives of high, medium, and low shedders; maintain ethical approval |
| Surface substrate kits | Representative materials for transfer studies | Include cotton, polyester, glass, wood, metal; characterize surface roughness and porosity |
| Environmental simulation chamber | Control temperature, humidity, and airflow | Enable study of environmental effects on persistence rates; calibrate regularly |
| Quantitative PCR systems | Precise DNA quantification | Essential for establishing transfer probabilities and degradation kinetics |
| Statistical analysis software | Data modeling and likelihood ratio calculation | Implement Bayesian statistical methods; validate all computational models |
A standardized workflow ensures comprehensive evaluation of activity-level propositions:
Multiple significant barriers hinder global adoption of activity-level evaluation in judicial settings [10]:
To address current limitations, the forensic research community should prioritize:
Activity-level forensic assessment represents an essential evolution beyond traditional source identification, directly addressing the questions most relevant to modern judicial decision-making. While significant implementation challenges exist, the methodological framework and experimental protocols outlined in this application note provide researchers with validated approaches for generating robust, defensible activity-level evaluations. Through continued refinement of transfer probability databases, computational modeling tools, and standardized reporting frameworks, the forensic science community can overcome current barriers to provide courts with the sophisticated activity-level guidance increasingly demanded in complex criminal investigations.
The European Network of Forensic Science Institutes (ENFSI) is the premier organization for forensic science in Europe, founded in 1995 with the purpose of improving mutual information exchange and enhancing the quality of forensic science delivery across the continent. Recognized by the European Commission, ENFSI provides critical coordination across 17 different Expert Working Groups, making it the sole organization of its kind in the European forensic science landscape [55]. Through its working groups, ENFSI develops Best Practice Manuals (BPMs) and forensic guidelines aimed at standardizing procedures, ensuring quality principles, establishing training processes, and creating unified approaches to forensic examinations [56] [57]. These documents provide an essential framework for forensic practitioners, particularly in traditional feature-comparison disciplines such as handwriting analysis, fingerprint examination, and firearms identification.
The development of ENFSI guidelines represents a significant step toward addressing key challenges in forensic science, including limited standardization, lingering subjectivity, and ongoing skepticism regarding reliability in legal contexts [57]. These guidelines aim to unify practices and foster collaboration across the forensic science community, though they have historically faced implementation challenges in smaller laboratories and private practice settings. Recent initiatives under the FOR FUTURE project, aligned with the European Forensic Science Area 2.0 Action Plan, demonstrate ENFSI's continued commitment to strengthening methodological reliability through multi-disciplinary collaborative exercises, digital transformation, and enhanced quality assurance measures [58].
Forensic evaluation operates within a hierarchical framework of propositions that determine the scope and focus of expert analysis. Source-level propositions concern the origin of trace material and typically address questions such as "Does this bloodstain come from Mr. A?" or "Did this handwriting originate from a specific individual?" [2]. At this level, evaluation primarily requires assessing the rarity of the corresponding analytical features in a relevant population, utilizing well-established models and statistical data. The focus remains exclusively on identifying the source of the evidentiary material without considering how it arrived at a particular location or its connection to specific activities.
In contrast, activity-level propositions address more complex questions about events and actions, such as "Did Mr. A punch the victim?" or "Did the suspect handle the stolen object?" [2]. This elevated level of interpretation requires considering additional factors beyond source identification, including transfer mechanisms, persistence characteristics, background prevalence, and recovery efficiency. The evolution of forensic science, particularly DNA profiling technology capable of producing results from tiny quantities of trace material, has accelerated a paradigm shift from "Whose DNA is this?" to "How did it get there?" [2]. This transition reflects the legal system's growing need for assistance with evaluating the probative strength of forensic results when competing propositions refer to different activities rather than mere source identification.
The distinction between source and activity level propositions has significant implications for forensic practice. Source-level evaluations benefit from relatively straightforward statistical approaches and established population databases, while activity-level assessments require integration of multiple complex factors including transfer probabilities, persistence mechanisms, and background prevalence data [2]. This complexity presents both challenges and opportunities for forensic practitioners, who must carefully consider the limitations of their conclusions when operating at different propositional levels.
Table 1: Key Differences Between Source and Activity Level Propositions
| Aspect | Source Level Propositions | Activity Level Propositions |
|---|---|---|
| Focus Question | "Whose is it?" | "How did it get there?" |
| Evaluation Complexity | Lower | Higher |
| Required Data | Population statistics, feature rarity | Transfer, persistence, background prevalence data |
| Statistical Framework | Well-established | Developing |
| Common Applications | DNA profiling, fingerprint identification, handwriting comparison | Sexual assault cases, physical assault scenarios, transfer evidence |
ENFSI-endorsed methodologies for handwriting examination emphasize structured approaches to reduce interpretative subjectivity and enable quantifiable measurement. The field has evolved from relying primarily on subjective expert judgment toward formalized frameworks that model the degree of similarity between handwriting samples through a two-stage process: feature-based evaluation and congruence analysis [57]. These stages produce quantitative markers that integrate into unified similarity scores, forming the foundation for complex comparisons involving multiple questions and known texts.
The proposed handwriting examination procedure follows eleven systematic steps: (1) pre-assessment and preliminary review of all materials; (2) feature evaluation of known documents; (3) determination of variation ranges; (4) feature evaluation of the questioned document; (5) similarity grading for features; (6) evaluation of handwriting elements; (7) calculation of feature-based similarity score; (8) congruence analysis of letterforms; (9) evaluation of congruence score; (10) calculation of total similarity score; and (11) expert conclusion formulation [57]. This comprehensive workflow ensures consistent application of analytical principles while providing transparency in methodology.
A critical component of modern handwriting examination involves quantitative assessment of specific features. The following workflow illustrates the standardized process for handwriting analysis:
ENFSI guidelines for DNA evidence evaluation have evolved significantly to address the challenges of activity-level propositions. While traditional DNA interpretation focused primarily on source attribution through profile rarity statistics, current best practices recognize the need for expanded frameworks that incorporate transfer mechanisms, persistence factors, and background prevalence [2]. This shift acknowledges that in many cases, the source of DNA may not be contested, while the mechanism of transfer remains central to legal questions.
The European guideline on evaluative reporting highlights the need for forensic scientists to engage with activity-level propositions despite perceived obstacles related to data limitations and complexity [2]. Recommended practices include using formal analyses of expressions for probative strength, incorporating sensitivity analyses to determine the impact of unknown factors, and developing specialized knowledge about transfer mechanisms that can inform evaluations even when exact activity parameters remain uncertain.
A cornerstone of ENFSI's approach to quality assurance involves multidisciplinary collaborative exercises designed to maximize forensic information recovery from single items by combining findings across different disciplines [58]. These exercises focus not on assessing single-discipline performance but on optimizing the integration of multiple forensic analyses to enhance overall evidentiary value. Additionally, ENFSI promotes regular collaborative exercises within specific domains, such as friction ridge analysis, to highlight the impact of different examination methods and evaluation approaches on identical testing samples [58].
Recent initiatives under the FOR FUTURE project aim to strengthen methodological reliability through paired approaches that combine human expertise with computer-assisted statistical tools. For example, in the friction ridge domain, ENFSI is working to reduce examiner variability during the ACE-V protocol implementation while pairing human judgments with score-based likelihood ratios for evaluative reporting [58]. This dual approach leverages both expert perception and quantitative assessment to enhance overall reliability.
Recent research has developed structured frameworks for formalized and quantitative handwriting examination that directly support source-level proposition testing. These methodologies systematically model similarity between handwriting samples through quantitative assessment of specific features including letter size, connection forms, regularity, proportions, and spatial relationships [57]. The framework incorporates mathematical modeling to determine variation ranges across known samples and calculates similarity grades for questioned documents based on deviation from established ranges.
The quantitative assessment employs defined scales for specific handwriting characteristics. For example, letter size evaluation uses a seven-point scale ranging from "very small" (assigned value 1 for letters <1 mm) to "very large" (value 7 for letters >5.5 mm), with intermediate values representing small, rather small, medium, rather large, and large sizes [57]. Similarly, connection forms are classified across twelve distinct categories including angular connections, garlands, arcades, threads, and specialized forms, each assigned specific numerical values for systematic comparison.
Table 2: Quantitative Assessment Scale for Handwriting Features
| Feature | Assessment Scale | Quantification Method |
|---|---|---|
| Letter Size | 7-point scale (1-7) | Measurement of minimum 50-80% of letters |
| Connection Form | 12 categories (0-11) | Classification based on dominant form |
| Size Regularity | Varied based on feature | Statistical variation across known samples |
| Letter Width | Defined point system | Proportional measurement |
| Inter-letter Intervals | 3-5 point range | Spatial measurement standardization |
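The seven-point letter-size scale can be implemented as a simple threshold lookup. Only the two end-points are fixed by the text ("very small" below 1 mm, "very large" above 5.5 mm); the four intermediate cut-offs below are evenly spaced assumptions for illustration.

```python
# Seven-point letter-size grading. Grades 1 (<1 mm) and 7 (>5.5 mm) follow
# the text; the intermediate cut-offs are illustrative assumptions only,
# and behaviour exactly at a boundary is not specified by the source.
import bisect

CUTOFFS_MM = [1.0, 1.9, 2.8, 3.7, 4.6, 5.5]
LABELS = ["very small", "small", "rather small", "medium",
          "rather large", "large", "very large"]

def size_grade(height_mm: float) -> tuple[int, str]:
    i = bisect.bisect_right(CUTOFFS_MM, height_mm)
    return i + 1, LABELS[i]

print(size_grade(0.8))   # (1, 'very small')
print(size_grade(3.0))   # (4, 'medium')
print(size_grade(6.2))   # (7, 'very large')
```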
ENFSI's current trajectory emphasizes broader adoption of statistical modeling and likelihood ratios across forensic disciplines. The "Route towards Likelihood Ratio" project specifically targets forensic chemists, aiming to develop skills in chemometrics and likelihood ratio calculations to better address forensic questions at appropriate proposition levels [58]. This initiative includes developing new ENFSI guidelines with practical examples to bridge the gap between traditional chemometric approaches and formal likelihood ratio frameworks.
Complementing this work, the REACT II project focuses on generating crucial data for statistical evaluation of activity-level propositions, particularly regarding transfer, persistence, prevalence, recovery, and background probabilities of biological traces [58]. By addressing the perpetual concern about relevant, robust data availability, this project enables more widespread implementation of probabilistic reasoning in forensic evaluations, especially for DNA evidence interpreted at activity level.
Purpose: To provide a standardized methodology for the forensic examination of handwriting samples through quantitative feature analysis and congruence assessment, supporting both source-level and activity-level propositions.
Scope: Applicable to the examination of handwritten texts and signatures in forensic contexts, including legal investigations, document authentication, and authorship verification.
Materials and Equipment: See Table 3.
Procedure:
1. Pre-assessment Phase
2. Feature Evaluation of Known Documents
3. Determination of Variation Ranges
4. Feature Evaluation of Questioned Document
5. Similarity Grading
6. Calculation of Feature-based Similarity Score
7. Congruence Analysis
8. Evaluation of Congruence Score
9. Total Similarity Score Calculation
10. Expert Conclusion Formulation
Quality Control: Implement independent, blinded peer review of examinations following ENFSI recommended practices [57]. Maintain comprehensive documentation of all measurements, calculations, and decision processes.
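The Determination of Variation Ranges, Similarity Grading, and Total Similarity Score steps of this procedure can be sketched as follows. This is an illustrative outline, not the formal scheme in [57]: the grade values, the tolerance band, and the aggregation rule are hypothetical stand-ins.

```python
# Hypothetical sketch of variation-range determination and similarity
# grading for one handwriting feature measured on the ordinal scales above.

def variation_range(known_values):
    """Range of a feature across the known (reference) documents."""
    return min(known_values), max(known_values)

def similarity_grade(questioned, rng, tolerance=1.0):
    """Grade 2 if inside the known range, 1 if within a tolerance band
    around it, else 0. Grade values and tolerance are assumptions."""
    lo, hi = rng
    if lo <= questioned <= hi:
        return 2
    if lo - tolerance <= questioned <= hi + tolerance:
        return 1
    return 0

def total_similarity_score(grades, max_grade=2):
    """Aggregate per-feature grades as a fraction of the maximum attainable."""
    return sum(grades) / (max_grade * len(grades))

# Usage: letter-size grades (scale units) from four known samples
rng = variation_range([3, 4, 4, 3])
g = similarity_grade(5, rng)          # just outside the range, within tolerance
score = total_similarity_score([g, 2, 2])
```

The expert conclusion would then be formulated from the total score together with the congruence analysis, which this sketch does not model.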
Purpose: To provide a systematic framework for evaluating the probative value of DNA profiling results when competing propositions relate to different activities rather than source identification.
Scope: Applicable to DNA trace evidence interpretation in cases where transfer mechanisms rather than source identification are central to legal questions.
Materials and Equipment: See Table 3.
Procedure:
1. Proposition Formulation
2. Data Collection
3. Transfer Probability Estimation
4. Background Prevalence Assessment
5. Persistence Factors Evaluation
6. Sensitivity Analysis
7. Likelihood Ratio Calculation
8. Conclusion Formulation
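The Transfer Probability Estimation, Sensitivity Analysis, and Likelihood Ratio Calculation steps of this procedure can be sketched numerically. Every probability below is a hypothetical placeholder; in casework the transfer (t), persistence/recovery (p), and background (b) values must come from relevant, robust data of the kind the REACT II project aims to generate [58].

```python
# Sketch of an activity-level likelihood ratio for DNA recovered from an item:
#   Hp: the person of interest performed the alleged activity (direct transfer)
#   Hd: no such contact; the DNA is present as background
# All numeric values are hypothetical placeholders.

def likelihood_ratio(t: float, p: float, b: float) -> float:
    """LR = P(DNA recovered | Hp) / P(DNA recovered | Hd).

    Under Hp the DNA may arrive via the alleged transfer and persist (t * p)
    or already be present as background (b); under Hd only background
    explains the finding.
    """
    p_e_given_hp = 1 - (1 - t * p) * (1 - b)   # either route under Hp
    p_e_given_hd = b                           # background only under Hd
    return p_e_given_hp / p_e_given_hd

# Sensitivity analysis: vary the background probability and observe the LR
for b in (0.01, 0.05, 0.10):
    print(f"b={b:.2f}  LR={likelihood_ratio(t=0.6, p=0.8, b=b):.1f}")
```

Even this toy model makes the point of the sensitivity analysis step visible: the LR falls sharply as the assumed background prevalence rises, which is why background data are central to defensible activity-level conclusions.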
Table 3: Essential Materials for Forensic Handwriting and DNA Analysis
| Item | Function | Application |
|---|---|---|
| High-Resolution Scanner | Digital capture of handwriting specimens | Enables precise measurement of graphic features |
| Digital Calipers/Measurement Software | Quantitative assessment of handwriting characteristics | Measures letter size, proportions, spacing |
| Feature Classification Guides | Standardized categorization of graphic elements | Ensures consistent application of qualitative assessments |
| Statistical Analysis Software | Calculation of similarity scores and likelihood ratios | Supports quantitative evaluation and objectivity |
| DNA Profiling Kits | Generation of DNA profiles from trace material | Standardized analysis of biological evidence |
| Transfer Probability Databases | Reference data on DNA transfer mechanisms | Informs activity-level proposition evaluation |
| Bayesian Network Modeling Tools | Integration of multiple probabilistic factors | Supports complex activity-level evaluations |
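The "Bayesian Network Modeling Tools" entry in Table 3 can be illustrated with a hand-rolled network evaluated by enumeration. The structure (Activity → Transfer → Persistence → Recovery) and every conditional probability below are assumptions chosen for illustration, not validated casework figures.

```python
# Toy Bayesian network for a trace-DNA finding, evaluated by enumerating
# the latent transfer and persistence nodes. All probabilities are
# hypothetical placeholders.
from itertools import product

P_TRANSFER = {True: 0.7, False: 0.05}  # P(transfer | activity); some background transfer under Hd
P_PERSIST = {True: 0.6, False: 0.0}    # P(persisted | transfer)
P_RECOVER = {True: 0.9, False: 0.0}    # P(recovered | persisted)

def p_recovered(activity: bool) -> float:
    """P(DNA recovered | activity), marginalizing the latent nodes."""
    total = 0.0
    for transfer, persisted in product([True, False], repeat=2):
        p = P_TRANSFER[activity] if transfer else 1 - P_TRANSFER[activity]
        p *= P_PERSIST[transfer] if persisted else 1 - P_PERSIST[transfer]
        p *= P_RECOVER[persisted]
        total += p
    return total

# Activity-level likelihood ratio from the network
lr = p_recovered(True) / p_recovered(False)
```

Dedicated tools replace this brute-force enumeration with efficient inference, but the principle is the same: the network integrates transfer, persistence, and recovery factors into a single probability for each competing proposition.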
The relationship between different proposition levels and their evaluation frameworks can be visualized through the following hierarchical structure, from most to least contextually demanding:

- Offense level: whether the person of interest committed the alleged offense (ultimately a matter for the court)
- Activity level: whether the person of interest performed the activity that led to the trace being deposited
- Source level: whether the trace originated from the person of interest
- Sub-source level: whether the DNA in the profile originated from the person of interest
Current ENFSI guidelines and recent research reflect a dynamic evolution in forensic science toward more quantitative, transparent, and proposition-focused methodologies. The distinction between source-level and activity-level propositions provides a critical framework for understanding both the capabilities and limitations of forensic evaluations across different disciplines. While source-level analysis benefits from established statistical approaches and standardized protocols, activity-level evaluation requires integration of more complex factors including transfer mechanisms, persistence characteristics, and background probabilities.
Recent initiatives under ENFSI's FOR FUTURE project demonstrate a clear trajectory toward enhanced digitalization, statistical formalization, and multi-disciplinary integration. The development of structured frameworks for quantitative handwriting examination, coupled with expanded support for likelihood ratio approaches across forensic disciplines, represents significant progress in addressing historical challenges related to subjectivity and reliability. Similarly, focused efforts to generate robust data on transfer and persistence mechanisms for DNA evidence will enable more widespread and defensible evaluation of activity-level propositions.
As forensic science continues to evolve, the interplay between human expertise and computer-assisted tools, between qualitative assessment and quantitative measurement, and between source attribution and activity evaluation will shape future best practices and guidelines. By embracing these developments while maintaining rigorous quality assurance and transparency, forensic science can enhance its contribution to legal processes and more effectively address the complex questions posed by modern criminal investigations.
The transition from source-level to activity-level propositions represents a necessary evolution in forensic science, crucial for providing courts with probative and contextually relevant evidence. This synthesis demonstrates that while methodological frameworks like likelihood ratios, Bayesian networks, and Chain Event Graphs provide robust tools for implementation, overcoming barriers related to data, training, and standardized protocols remains imperative. Future directions must focus on building a community-wide knowledge base on transfer and persistence phenomena, developing operational protocols for contextual sampling, and fostering interdisciplinary collaboration between scientists and legal professionals. By embracing activity-level evaluation, the forensic science community can significantly enhance its contribution to the fair and effective administration of justice.