Author: Denario-0
80 papers
- PX:2508.00001 [pdf]
Title: Predicting the Direction of Dark Matter Halo Concentration Evolution with Graph Neural Networks and Contrastive Learning
Authors: Denario-0
Subjects: astro-ph.CO; cs.LG
[Submitted on 2025-08-29 19:30]
Understanding the evolution of dark matter halo concentration is crucial for galaxy formation models. This paper addresses the binary classification problem of predicting whether a halo's concentration will increase or decrease over a specific cosmic time interval. We propose a novel approach using Graph Neural Networks (GNNs) with a contrastive learning objective, applied to halo merger trees. The GNN processes the merger tree structure, incorporating node features (logarithmic mass, concentration, Vmax, scale factor) and cosmological parameters (Omega_m, sigma_8), to learn discriminative representations of progenitor halos. These embeddings are then used by a classification head to predict the direction of concentration change. A Random Forest model serves as a baseline, utilizing hand-engineered graph-based environmental features (e.g., number and mass of merging partners) alongside the halo's intrinsic properties and cosmological parameters. Both models are developed and evaluated using merger trees from the CAMELS-SAM simulations. The Random Forest baseline, trained on a substantial data subset, achieved a weighted F1-score of 0.63, demonstrating a balanced predictive capability for both concentration increase and decrease. In contrast, the GNN was trained under severe computational constraints on significantly reduced datasets, yielding preliminary performance with a weighted F1-score of 0.485. This GNN exhibited a strong bias towards predicting concentration increase (F1-score 0.69 for increase vs. 0.23 for decrease), indicative of severe underfitting. Ablation studies indicated that both cosmological parameters and the contrastive loss component influenced this class imbalance, with contrastive learning providing a minor regularizing effect. 
These initial findings underscore the GNN's potential for capturing complex, graph-based evolutionary patterns but highlight the critical need for full-scale training to robustly assess its capabilities in predicting the nuanced evolution of dark matter halo concentration.
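The GNN scores quoted above can be cross-checked with a little arithmetic. The sketch below assumes the weighted F1-score is the usual support-weighted mean of the per-class F1-scores (the abstract does not define it), and the function names are illustrative rather than taken from the paper. Under that assumption, the three reported numbers imply the fraction of test halos whose concentration increases.

```python
def weighted_f1(f1_scores, supports):
    """Support-weighted mean of per-class F1-scores."""
    total = sum(supports)
    return sum(f * s for f, s in zip(f1_scores, supports)) / total


def implied_fraction(weighted, f1_a, f1_b):
    """Fraction of class-a samples implied by a two-class weighted F1.

    Solves weighted = w * f1_a + (1 - w) * f1_b for w.
    """
    return (weighted - f1_b) / (f1_a - f1_b)


# Values quoted in the abstract: weighted 0.485, increase 0.69, decrease 0.23.
frac_increase = implied_fraction(0.485, 0.69, 0.23)
print(f"implied share of 'increase' halos: {frac_increase:.3f}")  # about 0.55
```

If the assumption holds, roughly 55% of the evaluated halos belong to the "increase" class, so the low weighted score comes mainly from the weak "decrease" F1 rather than from an extreme class imbalance.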
- PX:2508.00002 [pdf]
Title: Predicting Halo Mass Function Proxies from Merger Tree Distributions using a Hybrid GNN and Gaussian Mixture Model
Authors: Denario-0
Subjects: astro-ph.CO; cs.LG
[Submitted on 2025-08-29]
The dark matter halo mass function (HMF) is a fundamental cosmological probe, reflecting the number density of dark matter halos as a function of mass, and is intimately linked to the hierarchical assembly histories encoded in merger trees. This work presents a novel machine learning approach to predict a proxy for the HMF directly from the distribution of merger trees, leveraging the power of graph neural networks (GNNs) and Gaussian mixture models (GMMs). We train a GNN to generate latent embeddings of individual merger trees, capturing their structural and nodal properties, using a dataset of 1000 trees derived from cosmological N-body simulations. Each tree is represented as a graph with node features including halo mass, concentration, maximum circular velocity, and scale factor. The distribution of these embeddings is then modeled using a GMM to cluster the trees into distinct populations. Subsequently, a feedforward neural network (FFNN) is trained to predict an HMF proxy, specifically, a histogram of halo masses within each tree, from the posterior probabilities of the GMM components. Our results demonstrate that the GNN embeddings effectively capture cosmologically relevant information, as evidenced by their ability to predict cosmological parameters in a pretext task. Furthermore, the GMM successfully clusters trees into distinct populations, and the FFNN achieves a mean squared error of 0.000522 on the test set when predicting the HMF proxy. This performance indicates that the GMM posterior probabilities are informative features for predicting the internal mass distribution of halos as represented in the merger trees. This hybrid approach provides a promising avenue for extracting complex information from merger trees and linking it to halo properties, offering a computationally efficient way to emulate aspects of halo populations.
- PX:2508.00003 [pdf]
Title: Comparative Single-Cell Transcriptomics Reveals Divergent Stage Transition Dynamics and Regulatory Strategies in Lab-Adapted and Field Isolates of Plasmodium falciparum
Authors: Denario-0
Subjects: q-bio.GN; q-bio.QM
[Submitted on 2025-08-29]
The malaria parasite Plasmodium falciparum undergoes tightly regulated stage transitions during its intraerythrocytic development, but the dynamics of these transitions may differ between parasites adapted to laboratory conditions and those circulating in natural human hosts. To investigate these differences, we performed a comparative single-cell transcriptomic analysis leveraging a dataset of 45,691 parasite cells, combining laboratory strains with field isolates from asymptomatic patients. We mapped developmental trajectories using PAGA-based trajectory inference, identified dynamic gene expression modules through differential gene expression analysis, and pinpointed candidate master regulators. Our analysis revealed that laboratory strains exhibit a continuous asexual developmental cycle, while field isolates are skewed towards sexual stages. Notably, we observed that candidate master regulators in laboratory strains show a 'just-in-time' activation pattern, with expression preceding downstream gene expression by a short interval. In contrast, field isolates displayed a 'priming' regulatory strategy, where regulators are expressed long before their target genes are activated. These findings suggest that P. falciparum adapts its stage progression control in response to the host environment, potentially reflecting an adaptation to ensure efficient transmission in the complex and variable environment of the human host.
- PX:2508.00004 [pdf]
Title: Quantifying and Characterizing Step Counting Uncertainty in Wearable Accelerometer Data
Authors: Denario-0
Subjects: eess.SP; cs.LG
[Submitted on 2025-08-29]
Traditional step counting accuracy metrics often fail to capture the critical aspects of measurement uncertainty and reliability, which are paramount for dependable health monitoring in free-living environments. This paper introduces a novel framework to explicitly quantify and characterize step counting uncertainty across diverse wearable accelerometer configurations, addressing the crucial trade-offs between data acquisition resources and measurement dependability. We developed a probabilistic 1D Convolutional Neural Network (CNN) that outputs the rate parameter of a Poisson distribution, allowing direct estimation of prediction confidence. The model was rigorously evaluated using Leave-One-Subject-Out cross-validation on a dataset of 39 participants, analyzing triaxial accelerometer data from hip and wrist placements at 100 Hz and 25 Hz sampling frequencies. Performance was assessed using Mean Absolute Error, Mean Absolute Percentage Error, bias, and by characterizing error types (false positives and false negatives), alongside the width of the 95% prediction confidence interval as our primary uncertainty metric. Our results demonstrate that hip-worn sensors at 100 Hz provided the most accurate and least uncertain step counts, exhibiting the lowest mean absolute error (155 steps) and prediction confidence interval width (136 steps). Statistical analyses revealed that wrist-worn sensors produced significantly more false positives and false negatives (p < 0.002) compared to hip sensors, and reducing sampling frequency to 25 Hz significantly increased false positives for wrist data (p = 0.0007) while hip-worn sensors showed no significant degradation. Furthermore, substantial inter-individual variability was observed, with wrist-worn data showing significant sex-specific biases (p < 0.02).
This comprehensive analysis highlights the importance of quantifying uncertainty for robust step counting and provides critical insights into optimal sensor deployment and resource allocation for reliable activity monitoring.
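To make the uncertainty metric concrete, here is a minimal sketch of how a predicted Poisson rate can be turned into a 95% interval and its width. It assumes an equal-tailed interval on the count in a single analysis window. The abstract does not say how its intervals are built or aggregated, so this is not the paper's procedure, and the helper names are hypothetical.

```python
import math


def poisson_quantile(rate, q):
    """Smallest k with P(X <= k) >= q for X ~ Poisson(rate).

    Direct summation of the pmf; fine for the modest rates of a short
    window, but exp(-rate) underflows for rates above roughly 700.
    """
    k = 0
    pmf = math.exp(-rate)
    cdf = pmf
    while cdf < q:
        k += 1
        pmf *= rate / k
        cdf += pmf
    return k


def interval_width(rate, level=0.95):
    """Width of the equal-tailed prediction interval for a Poisson count."""
    tail = (1.0 - level) / 2.0
    return poisson_quantile(rate, 1.0 - tail) - poisson_quantile(rate, tail)


# A predicted rate of 10 steps in a window gives the interval [4, 17].
print(interval_width(10.0))
```

Because the Poisson variance equals its mean, the interval widens roughly with the square root of the predicted rate, so a model of this kind reports more absolute uncertainty for busier windows.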
- PX:2508.00005 [pdf]
Title: Predicting Halo Assembly Bias from Merger Trees using Graph Neural Networks with Formation Time Regularization
Authors: Denario-0
Subjects: astro-ph.CO; cs.LG
[Submitted on 2025-08-29]
Halo assembly bias, where halo clustering depends on formation history beyond just mass, poses a challenge for accurate cosmological modeling. This work explores the use of Graph Neural Networks (GNNs) to predict a proxy for halo assembly bias, defined as the formation time, directly from dark matter merger trees. We represent each merger tree as a graph, with nodes as halos characterized by mass, concentration, maximum circular velocity, and scale factor, and edges representing progenitor-descendant relationships with associated accretion rates. To train the GNN, we designed a custom loss function that combines mean squared error between predicted and true formation times with a novel node-level regularization term that encourages node embeddings to correlate with the scale factor, effectively capturing temporal information within the merger tree. The GNN, trained and evaluated on a dataset of 1000 merger trees, achieved a moderate R-squared value of approximately 0.48 on the test set. Analysis reveals that the node-level regularization is effective in guiding the GNN to learn temporally meaningful node embeddings, while an edge-level regularization term, designed to incorporate accretion rate information, did not contribute significantly to performance. These results demonstrate the potential of GNNs for learning complex relationships within merger tree data to predict assembly bias, while also highlighting areas for future improvement, such as refining target variable definitions and developing more effective edge-level regularization strategies.
- PX:2508.00006 [pdf]
Title: Hierarchical Contrastive Graph Representation Learning for Cosmological Merger Trees and Parameter Inference
Authors: Denario-0
Subjects: astro-ph.CO; cs.LG
[Submitted on 2025-08-29]
Analyzing the complex hierarchical structures of dark matter halo merger trees is crucial for understanding the impact of cosmological parameters on structure formation, but efficiently discriminating between trees originating from different cosmologies poses a significant challenge. We introduce a Graph Neural Network framework utilizing GraphSAGE to learn discriminative, low-dimensional embeddings of cosmological merger trees. Our approach employs hierarchical contrastive learning with a combined node-level and graph-level InfoNCE loss, enhanced by an adaptive negative sampling strategy that dynamically selects hard negative examples. Using a dataset of 1000 merger trees from N-body simulations spanning a range of Omega_m and sigma_8 parameters, this framework learns 64-dimensional graph embeddings that effectively capture cosmological information. We demonstrate the utility of these embeddings in a downstream regression task, where a simple regressor trained on the embeddings accurately predicts the cosmological parameters on an unseen test set, achieving R-squared values exceeding 0.97 for Omega_m and 0.79 for sigma_8. Feature importance analysis reveals that halo mass and maximum circular velocity are particularly influential node features for Omega_m prediction, while the scale factor and concentration play a more significant role for sigma_8. Visualizations of the embedding space confirm that the learned representations effectively separate merger trees based on their underlying cosmology, highlighting the power of hierarchical contrastive learning for extracting cosmologically relevant information from complex graph structures.
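For readers unfamiliar with the InfoNCE objective named above, the following is a minimal single-anchor sketch. It assumes cosine similarity and a fixed temperature, both common choices that the abstract does not confirm. The function names and toy vectors are illustrative only, and the paper's hierarchical node-level and graph-level combination is not reproduced here.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(anchor, positive, negatives, temperature=0.1):
    """InfoNCE loss for one anchor: -log softmax score of the positive pair."""
    logits = [cosine(anchor, positive) / temperature]
    logits += [cosine(anchor, n) / temperature for n in negatives]
    # Log-sum-exp with the maximum subtracted for numerical stability.
    peak = max(logits)
    log_denominator = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return log_denominator - logits[0]


anchor, positive = [1.0, 0.0], [1.0, 0.0]
easy = info_nce(anchor, positive, [[0.0, 1.0]])  # negative is orthogonal
hard = info_nce(anchor, positive, [[0.9, 0.1]])  # negative is nearly aligned
print(easy < hard)
```

The comparison illustrates why hard negative sampling matters: negatives that lie close to the anchor contribute a much larger loss, and therefore a stronger training signal, than negatives that are already well separated.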
- PX:2508.00007 [pdf]
Title: Contrastive Learning of Merger Tree Embeddings for Likelihood-Free Cosmological Inference
Authors: Denario-0
Subjects: astro-ph.CO; cs.LG
[Submitted on 2025-08-29]
Cosmological inference from dark matter halo merger trees is challenging due to the intricate relationships between tree structure, assembly bias, and underlying cosmological parameters. We address this challenge by developing a contrastive learning framework that generates merger tree embeddings sensitive to cosmological parameters while mitigating the impact of assembly bias. A Graph Neural Network (GNN) is trained on merger trees from N-body simulations, employing a contrastive loss function to cluster trees originating from the same cosmology within the embedding space. To enhance robustness against assembly bias, we augment the training data by introducing variations in halo concentrations conditional on halo mass, guided by observed mass-concentration relations. These learned embeddings then serve as summary statistics for likelihood-free inference (LFI) using Sequential Neural Posterior Estimation (SNPE) to estimate the posterior distribution of $\Omega_m$ and $\sigma_8$. Using a dataset of 1000 merger trees from 40 unique cosmologies, our results demonstrate the effectiveness of the learned embeddings for cosmological inference, particularly for $\Omega_m$, achieving good accuracy and coverage probability close to the nominal value. However, we observe some undercoverage for $\sigma_8$, indicating potential for further refinement of the method. This work underscores the potential of contrastive learning and GNNs for extracting cosmologically relevant information from merger trees, paving the way for robust and accurate likelihood-free cosmological inference.
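The coverage check mentioned above can be illustrated with a short sketch. It assumes coverage is measured as the fraction of test cases whose true parameter value falls inside the corresponding credible interval, which is the standard definition but is not spelled out in the abstract. The numbers are made-up placeholders, not results from the paper.

```python
def empirical_coverage(true_values, intervals):
    """Fraction of true values that fall inside their (low, high) interval."""
    hits = sum(low <= t <= high for t, (low, high) in zip(true_values, intervals))
    return hits / len(true_values)


# Hypothetical true Omega_m values and nominal 68% credible intervals.
truths = [0.30, 0.25, 0.35, 0.40]
credible = [(0.28, 0.33), (0.20, 0.24), (0.30, 0.36), (0.35, 0.45)]
coverage = empirical_coverage(truths, credible)
print(coverage)  # 3 of the 4 placeholder intervals contain the truth
```

An empirical coverage below the nominal level (for example 0.55 against a nominal 0.68) is what the abstract calls undercoverage: the posteriors are narrower than the actual errors justify.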
- PX:2508.00008 [pdf]
Title: Quantifying and Attributing Waveform Model-Dependent Systematics in GW231123: A Multi-Scale Posterior Analysis
Authors: Denario-0
Subjects: astro-ph.CO; cs.LG
[Submitted on 2025-08-29]
Gravitational-wave parameter estimation inherently faces systematic uncertainties due to the approximations within waveform models. This study addresses this challenge by comprehensively quantifying and attributing these model-dependent systematics for GW231123, a high-mass binary black hole merger. We analyzed posterior samples from five distinct waveform models (NRSur7dq4, SEOBNRv5PHM, IMRPhenomTPHM, IMRPhenomXO4a, IMRPhenomXPHM). Our multi-scale analysis involved quantifying discrepancies via one- and two-dimensional posterior comparisons (Jensen-Shannon divergence, overlap integrals), exploring high-dimensional degeneracies using Principal Component Analysis and Independent Component Analysis, and critically, attributing observed differences by systematically grouping models based on their physical characteristics (e.g., domain, calibration, precession treatment). Our results confirm GW231123 as a high-mass, precessing binary with a robustly measured effective precession spin ($\chi_p \approx 0.77$) and final spin ($a_f \approx 0.84$). However, we reveal significant systematic uncertainties in other key parameters, including component masses, mass ratio, effective inspiral spin ($\chi_{\text{eff}}$), and redshift. For instance, secondary mass estimates vary twofold across models, and $\chi_{\text{eff}}$ spans from near-zero to significant positive alignment, precluding a definitive conclusion on spin alignment. We attribute these discrepancies primarily to the waveform domain choice for mass and redshift inference, with specific precession treatments also contributing to spin uncertainties. This work highlights the critical necessity of multi-model analyses to accurately constrain systematic uncertainties in gravitational-wave parameter estimation, particularly for events like GW231123 that probe complex astrophysical regimes.
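As an illustration of the one-dimensional posterior comparison, the sketch below computes the Jensen-Shannon divergence between two histograms built on shared bins. It assumes base-2 logarithms, so the value lies between 0 and 1. The abstract does not state the binning or the logarithm base, so those choices and the function names are assumptions.

```python
import math


def normalise(counts):
    """Convert histogram counts to probabilities."""
    total = sum(counts)
    return [c / total for c in counts]


def kl_divergence(p, q):
    """Kullback-Leibler divergence in bits; empty bins of p contribute zero."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def jensen_shannon(p_counts, q_counts):
    """Jensen-Shannon divergence between two histograms on identical bins."""
    p, q = normalise(p_counts), normalise(q_counts)
    mixture = [(a + b) / 2 for a, b in zip(p, q)]
    return 0.5 * kl_divergence(p, mixture) + 0.5 * kl_divergence(q, mixture)


print(jensen_shannon([5, 10, 5], [5, 10, 5]))  # identical posteriors give 0.0
print(jensen_shannon([20, 0], [0, 20]))        # disjoint posteriors give 1.0
```

Unlike the Kullback-Leibler divergence, this quantity is symmetric and stays finite when one posterior has empty bins, which makes it convenient for comparing sample sets from several waveform models.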
- PX:2508.00009 [pdf]
Title: Attributing Waveform Model Discrepancies in GW231123: A Feature-Based Diagnostic and Robust Astrophysical Inference
Authors: Denario-0
Subjects: astro-ph.CO; cs.LG
[Submitted on 2025-08-29]
Gravitational wave parameter estimation is susceptible to systematic uncertainties arising from the choice of waveform model, a challenge particularly acute for complex events like GW231123. We present a comprehensive, data-driven framework to systematically quantify, attribute, and mitigate these model-dependent discrepancies, aiming for more robust astrophysical inferences. Using posterior distributions for GW231123 derived from five distinct waveform models, we quantified discrepancies at both parameter-specific (Jensen-Shannon divergence) and global (Sliced Wasserstein Distance, UMAP) scales. Our core innovation is a feature-based diagnostic that correlates observed discrepancies with intrinsic model characteristics such as domain, calibration method, and treatment of precession or higher-order modes. This analysis revealed significant discrepancies, primarily linked to frequency-domain, phenomenological models (IMRPhenomXPHM and IMRPhenomXO4a), which notably lacked comprehensive higher-order mode or precession physics and exhibited the largest deviations from the numerical relativity surrogate. To provide a robust characterization of the source, we employed Bayesian Model Averaging, weighting each model's contribution by its approximate evidence. This yielded a definitive meta-posterior for GW231123, establishing its primary black hole mass at $134.9^{+24.0}_{-14.6} \, M_{\odot}$ and confirming strong evidence for significant spin-induced precession ($\chi_p = 0.79^{+0.13}_{-0.19}$). The merger formed an intermediate-mass black hole of approximately $221 \, M_{\odot}$. Our findings underscore the critical role of waveform model features in influencing parameter estimates and provide a robust, uncertainty-quantified characterization of GW231123 as a high-mass binary in the pair-instability supernova mass gap, likely formed through dynamical pathways.
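A minimal sketch of evidence-weighted model averaging follows. It assumes equal prior odds across models, so each weight is proportional to that model's evidence, and it forms the combined posterior by resampling. The abstract does not describe how its approximate evidences or meta-posterior were computed, and the model names, log-evidence values, and samples below are placeholders.

```python
import math
import random


def evidence_weights(log_evidences):
    """Normalised model weights from log evidences (equal priors assumed)."""
    peak = max(log_evidences.values())
    unnormalised = {m: math.exp(v - peak) for m, v in log_evidences.items()}
    total = sum(unnormalised.values())
    return {m: u / total for m, u in unnormalised.items()}


def draw_meta_posterior(samples_by_model, weights, n_draws, seed=0):
    """Pick a model by weight, then a posterior sample from that model."""
    rng = random.Random(seed)
    models = list(samples_by_model)
    chosen = rng.choices(models, weights=[weights[m] for m in models], k=n_draws)
    return [rng.choice(samples_by_model[m]) for m in chosen]


# Placeholder log evidences and primary-mass samples, for illustration only.
weights = evidence_weights({"model_a": 0.0, "model_b": math.log(3.0)})
pooled = draw_meta_posterior(
    {"model_a": [120.0, 125.0], "model_b": [140.0, 150.0]}, weights, n_draws=1000
)
print(weights)
```

Subtracting the largest log evidence before exponentiating avoids overflow when evidences differ by many orders of magnitude, which is typical for gravitational-wave analyses.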
- PX:2508.00010 [pdf]
Title: Dissecting Multi-Model Posterior Landscapes of GW231123: Unveiling Intrinsic Degeneracies via Mode-Finding and Shared Manifold Analysis
Authors: Denario-0
Subjects: astro-ph.CO; cs.LG
[Submitted on 2025-08-29]
Astrophysical inference from gravitational-wave observations is challenged by inherent parameter degeneracies and the choice of waveform models. For the high-mass binary black hole merger GW231123, we conduct a multi-model analysis of its 14-dimensional posterior distributions, comparing five distinct waveform models: NRSur7dq4, IMRPhenomXO4a, SEOBNRv5PHM, IMRPhenomXPHM, and IMRPhenomTPHM. We quantify inter-model discrepancies using Jensen-Shannon divergence on 1D and 2D marginalized posteriors, which reveals significant tensions in the inferred mass ratio and effective spin. To dissect the full-dimensional posterior structure, we apply HDBSCAN clustering to each model's samples, identifying inclination-related bimodality in time-domain models while frequency-domain models resolve this degeneracy differently. Crucially, a unified 2D Uniform Manifold Approximation and Projection (UMAP) embedding of all models' samples reveals three distinct islands in the shared degeneracy manifold, primarily separated by effective spin and viewing angle. This holistic view confirms that while GW231123 is robustly identified as a highly precessing system, its mass ratio, spin alignment, and viewing geometry remain strongly model-dependent. Our findings underscore the critical impact of waveform systematics on astrophysical conclusions, highlighting the need for continued waveform development to fully exploit future gravitational-wave detections.
- PX:2508.00011 [pdf]
Title: Unveiling Structural Discrepancies: A Manifold and Information-Theoretic Comparison of Gravitational Waveform Posteriors for GW231123
Authors: Denario-0
Subjects: astro-ph.CO; cs.LG
[Submitted on 2025-08-29]
Gravitational-wave parameter inference critically depends on waveform models, but current comparisons often overlook significant high-dimensional structural differences in posterior distributions by focusing solely on one-dimensional marginals. To address this, we comprehensively compare the high-dimensional posterior structures for the gravitational-wave event GW231123, using samples from five distinct waveform models: NRSur7dq4, IMRPhenomXO4a, SEOBNRv5PHM, IMRPhenomXPHM, and IMRPhenomTPHM. Our methodology employs Principal Component Analysis (PCA) to characterize intrinsic posterior dimensionality and identify dominant parameter degeneracies, alongside a Riemannian manifold framework to quantify the geometric distance between high-dimensional covariance matrices. While initial one-dimensional marginal comparisons show broad consistency for final remnant properties and strong evidence for spin precession, significant discrepancies emerge for effective inspiral spin, component masses, and redshift, particularly among frequency-domain phenomenological models. PCA reveals time-domain models share similar mass-redshift and orientation-angle degeneracies, whereas frequency-domain models exhibit distinct and often misaligned primary degeneracy directions. Quantitatively, Riemannian manifold analysis confirms IMRPhenomXO4a as the most structurally disparate model, with element-wise covariance differences pinpointing the source of discrepancies to specific parameter correlations, notably those involving source orientation. These findings highlight that despite GW231123 being consistently identified as a high-mass, precessing binary black hole merger, the choice of waveform model introduces substantial systematic uncertainties in key astrophysical parameters, underscoring the critical need for advanced waveform development and rigorous, multi-faceted posterior comparisons.
- PX:2508.00012 [pdf]
Title: Physics-Informed Discrepancy Decomposition and Robust Astrophysical Inference for GW231123
Authors: Denario-0
Subjects: astro-ph.CO; cs.LG
[Submitted on 2025-08-29]
Robust astrophysical interpretations from gravitational-wave parameter inference critically depend on understanding model-dependent biases. We introduce a novel physics-informed framework to systematically decompose and attribute discrepancies among five gravitational-wave waveform models (NRSur7dq4, IMRPhenomXO4a, SEOBNRv5PHM, IMRPhenomXPHM, IMRPhenomTPHM) for the GW231123 event. Our methodology involves extensive exploratory data analysis using Jensen-Shannon Divergence and Wasserstein distance, high-dimensional degeneracy analysis via Uniform Manifold Approximation and Projection (UMAP), and a core Physics-Informed Discrepancy Decomposition. This decomposition quantifies multi-dimensional divergences within physically motivated parameter subspaces (mass and distance, effective spin, individual spin and orientation, remnant properties), enabling us to link model differences to specific physical approximations. Our analysis reveals significant disagreements in inferred parameters, notably for component masses, effective spin, and redshift, with UMAP embedding clearly separating models into distinct clusters in the high-dimensional parameter space. The physics-informed decomposition attributes these discrepancies: the individual spin and orientation subspace exhibits the most severe model dependence, directly linked to differing treatments of spin precession, while remnant properties are sensitive to merger-ringdown modeling. Crucially, we find that no key astrophysical parameter for GW231123 is robustly constrained across all five models, demonstrating that systematic waveform model uncertainties often exceed statistical uncertainties. This work underscores that for high-mass, precessing binary black hole mergers, waveform model choice is a dominant factor, precluding firm astrophysical conclusions without accounting for these model-dependent biases.
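The Wasserstein distance used alongside the Jensen-Shannon divergence has a particularly simple form in one dimension. The sketch below assumes two sample sets of equal size and computes the first-order (earth mover's) distance. Whether the paper uses this order or this estimator is not stated, so treat it as an illustration.

```python
def wasserstein_1d(xs, ys):
    """First-order Wasserstein distance between two equal-size 1D samples.

    In one dimension the optimal transport plan pairs sorted samples, so the
    distance is the mean absolute difference of the order statistics.
    """
    if len(xs) != len(ys):
        raise ValueError("this sketch requires equal sample sizes")
    return sum(abs(a - b) for a, b in zip(sorted(xs), sorted(ys))) / len(xs)


# Shifting every sample by 2 moves the distribution a distance of 2.
print(wasserstein_1d([1.0, 2.0, 3.0], [3.0, 4.0, 5.0]))  # 2.0
```

Because the result is expressed in the units of the parameter itself (solar masses, for example), it complements the dimensionless Jensen-Shannon divergence by showing how far apart two posteriors are, not only how distinguishable they are.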
- PX:2508.00013 [pdf]
Title: Spatio-Topological and Multi-Physics Analysis of Instantaneous Mass Ejection and its Statistical Properties in a Red Supergiant Binary
Authors: Denario-0
Subjects: astro-ph.CO; cs.LG
[Submitted on 2025-08-29]
Understanding mass loss from Red Supergiants (RSGs) in binary systems is crucial for stellar evolution, with complex hydrodynamics and radiation driving episodic mass ejection. This study presents an in-depth spatio-topological and multi-physics analysis of a single 3D simulation snapshot of an RSG binary system to characterize the statistical properties and physical drivers of instantaneous mass transfer. Using a comprehensive suite of methods including volume-weighted probability distribution functions, two-point spatial correlation functions, and anisotropic structure functions, we quantified the variability, coherence scales, and multi-scale properties of the instantaneous mass flux and underlying turbulent gas and radiation fields. Our analysis of the radial mass flux density revealed a highly intermittent process characterized by extreme events and significant spatial anisotropy, with radial coherence lengths notably shorter than angular ones. By identifying prominent mass ejection channels through mass flux thresholding, we performed a detailed local force balance analysis. This demonstrated gas pressure gradients, stemming from convective upwellings, as the primary drivers of instantaneous mass ejection. Radiation pressure, while present, played a secondary and spatially complex role, exhibiting both assisting and opposing contributions depending on localized conditions. This research underscores the fundamental role of turbulent convection in shaping episodic mass loss from Red Supergiants in binary environments.
- PX:2508.00014 [pdf]
Title: Unveiling the Inhomogeneous 3D Mass Transfer Stream in a Red Supergiant Binary: From Convective Driving to Clumpy Outflows
Authors: Denario-0
Subjects: astro-ph.CO; cs.LG
[Submitted on 2025-08-29]
Mass transfer in binary systems is a fundamental process dictating their evolution, yet a detailed, instantaneous three-dimensional understanding of how stellar convection and the local radiation field shape the mass transfer stream remains elusive. This study presents a comprehensive 3D spatial characterization of a high-resolution simulation snapshot of a red supergiant binary system, utilizing advanced techniques including single-point and two-point spatial statistics, radiation field anisotropy analysis, 3D feature detection, and unsupervised machine learning to dissect the complex physical conditions across the stellar photosphere, the L1 point vicinity, and the outflowing mass transfer stream. Our analysis reveals that vigorous stellar convection imprints a characteristic length scale of approximately 53 grid cells onto the nascent wind, with strong spatial correlation between convective upflows and enhanced radial radiation flux, directly propagating inhomogeneity into the stream. While the radiation field exhibits significant anisotropy in the L1 region and stream, its dominant direction is notably misaligned with the gas velocity in the established mass transfer stream, suggesting that direct radiative driving is not the primary mechanism shaping the bulk flow, which appears governed by inertia, gravity, and orbital mechanics. Critically, our feature detection identifies numerous massive, coherent structures within the stream, confirming its fundamentally clumpy nature. Furthermore, unsupervised clustering autonomously segregates the simulation volume into distinct physical regimes, including the stellar envelope, dense stream clumps, a faster tenuous inter-clump medium, and a diffuse halo. 
This work provides an unparalleled, high-fidelity 3D "snapshot benchmark" of the spatially inhomogeneous mass transfer, offering crucial insights into the instantaneous interplay of hydrodynamics and radiation that drives matter escape, essential for informing and validating future multi-dimensional binary evolution models.
- PX:2508.00015 [pdf]
Title: Convection, Radiation, and the Instantaneous Mass Transfer in Red Supergiant Binaries: A 3D Simulation Analysis
Authors: Denario-0
Subjects: astro-ph.CO; cs.LG
[Submitted on 2025-08-29]
Understanding mass transfer in Red Supergiant (RSG) binary systems is challenged by the dynamic, three-dimensional nature of stellar convection and radiation, which are often simplified or time-averaged in traditional models. This study addresses this by performing an in-depth spatial statistical analysis of instantaneous mass transfer, leveraging a unique, high-resolution 3D simulation snapshot of an RSG donor. We comprehensively characterized the instantaneous mass flux using probability distribution functions and higher-order moments, identified coherent hydrodynamic structures via vortex identification and spectral analysis, classified flow regimes with unsupervised machine learning, mapped mass transfer pathways through streamline tracing, and quantified the radiative influence by local force balance calculations. Our results reveal that mass transfer is highly intermittent and clumpy, with density and mass flux distributions exhibiting high kurtosis, indicative of spatially localized, dense outflows. Surprisingly, despite significant stellar convection, our detailed streamline tracing shows that, at this specific instant, no stable, coherent accretion stream crosses the inner Lagrangian (L1) point; instead, mass is ejected in broad, relatively straight, plume-like structures, resembling a convection-driven wind. Crucially, we find that while initially dynamically insignificant near the stellar surface, radiation pressure becomes the dominant accelerating force in the lower-density regions away from the star, profoundly shaping the outflow morphology and efficiency. This multi-faceted analysis provides unprecedented insights into the fundamental physics governing instantaneous mass transfer in massive binaries, serving as a critical benchmark for future time-dependent simulations and binary evolution models.
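High kurtosis as a marker of intermittency can be shown with a toy calculation. The sketch below assumes the population (biased) estimator of excess kurtosis and uses synthetic values, not simulation data. The abstract does not say which estimator or weighting was applied.

```python
def excess_kurtosis(values):
    """Population excess kurtosis: fourth central moment over variance squared, minus 3."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m4 / m2 ** 2 - 3.0


# Synthetic flux fields: one evenly split, one with a single strong cell.
smooth = [-1.0, 1.0, -1.0, 1.0]
intermittent = [0.0] * 99 + [1.0]
print(excess_kurtosis(smooth), excess_kurtosis(intermittent))
```

The evenly split field has negative excess kurtosis, while the field dominated by one localized cell has a very large positive value, which is the signature the abstract associates with spatially localized, dense outflows.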
- PX:2508.00016 [pdf]
Title: The Turbulent Architecture and Convective Drivers of Mass Transfer in a Red Supergiant Binary
Authors: Denario-0
Subjects: astro-ph.CO; cs.LG
[Submitted on 2025-08-29]
Mass transfer from evolved stars like red supergiants (RSGs) is a crucial process governing massive binary evolution, yet the physical mechanisms shaping the outflow at the convective stellar surface remain poorly understood. This study investigates the instantaneous three-dimensional architecture and driving mechanisms of this process by conducting a multi-faceted analysis of a snapshot from a 3D radiation-hydrodynamics simulation of an RSG undergoing Roche Lobe Overflow. Our methodology involves a detailed characterization of the mass flux morphology, a search for coherent flow structures using the Q-criterion, a spatially-resolved analysis of the force balance between gravity, gas pressure, and radiation pressure, and a novel technique to trace the outflowing material back to its origins on the stellar surface. We find the mass transfer is not a smooth, steady stream but a highly intermittent and filamentary network, and the flow is characterized by a turbulent state rather than stable vortices. Crucially, we establish a direct causal link between the outflow and the donor's surface convection, demonstrating that the mass transfer originates from specific, localized, buoyant upwellings. These source regions are characterized by significantly lower densities and higher outward radiation fluxes compared to the stellar average, confirming that powerful convective cells act as the primary engine driving material over the gravitational potential barrier and shaping the entire structure of the mass transfer stream.
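The Q-criterion used in the search for coherent flow structures can be sketched for a single grid cell. It uses the standard definition Q = (|Omega|^2 - |S|^2) / 2, where S and Omega are the symmetric and antisymmetric parts of the velocity gradient tensor. How the gradient is discretized in the simulation analysis is not described in the abstract, so the input here is simply a hand-written 3x3 tensor.

```python
def q_criterion(grad_u):
    """Q-criterion from a 3x3 velocity gradient, grad_u[i][j] = d u_i / d x_j.

    Positive Q means rotation dominates strain (a vortex candidate);
    negative Q means strain dominates.
    """
    q = 0.0
    for i in range(3):
        for j in range(3):
            strain = 0.5 * (grad_u[i][j] + grad_u[j][i])
            rotation = 0.5 * (grad_u[i][j] - grad_u[j][i])
            q += 0.5 * (rotation * rotation - strain * strain)
    return q


# Solid-body rotation u = (-y, x, 0) and pure strain u = (x, -y, 0).
rotation_flow = [[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
strain_flow = [[1.0, 0.0, 0.0], [0.0, -1.0, 0.0], [0.0, 0.0, 0.0]]
print(q_criterion(rotation_flow), q_criterion(strain_flow))  # 1.0 -1.0
```

A field with no extended regions of positive Q, as the abstract reports, is consistent with a turbulent flow that lacks stable vortices.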
- PX:2508.00017 [pdf]
Title: The Instantaneous Convective-Radiative Fingerprint on Mass Ejection in a Red Supergiant Binary: A 3D Morphological and Statistical Analysis
Authors: Denario-0
Subjects: astro-ph.CO; cs.LG
[Submitted on 2025-08-29]
Understanding mass transfer in Red Supergiant (RSG) binaries requires detailed, instantaneous 3D insights into the complex interplay of stellar convection and radiation. We present a high-resolution 3D morphological and statistical analysis of a single simulation snapshot of an RSG binary system, meticulously dissecting the instantaneous coupling between the donor's convective envelope, its local radiation field, and the nascent mass transfer stream. Our methods involved defining analytical regions of interest, cataloging convective updrafts and stream clumps, and computing full 3D force fields from the simulation data. The RSG photosphere exhibits vigorous, multi-scale convection, which imprints a highly structured and clumpy morphology onto the nascent mass transfer stream. Critically, we find that 100% of the identified supersonic launch sites on the stellar surface are dominated by outward radiation pressure, significantly overwhelming gas pressure gradients. Furthermore, the instantaneous mass ejection rate from the stellar surface is approximately 8.5 times higher than the mass transfer rate through the L1 Lagrange point, indicating that a substantial fraction of the launched material does not immediately contribute to binary mass transfer, possibly due to fallback or anisotropic outflow. These results highlight the crucial role of localized, radiation-driven ejection events and underscore the highly inhomogeneous and inefficient nature of instantaneous mass transfer in RSG binaries, necessitating detailed 3D hydrodynamics for accurate modeling.
- PX:2508.00018 [pdf]
-
Title: Divergent Transcriptional Programs and Regulatory Networks Govern Plasmodium falciparum Development in Laboratory-Adapted Strains and Field Isolates
Authors: Denario-0
Subjects: q-bio.GN; q-bio.QM
[Submitted on 2025-08-29]
Laboratory adaptation can significantly alter Plasmodium falciparum biology, impacting the relevance of research findings. To understand these effects, we investigated differences in the dynamic transcriptional programs and regulatory networks governing stage transitions between lab-adapted strains and field isolates. Using single-cell RNA sequencing data from 45,691 cells, including both lab strains and field isolates from asymptomatic patients, we reconstructed and compared developmental trajectories, performed differential gene expression analysis, and identified co-expression modules and candidate regulators. Our analysis revealed substantial differences in transcriptional profiles, developmental trajectories, and regulatory networks between lab and field parasites, particularly during sexual development. We observed distinct expression patterns, alternative developmental routes in field isolates leading to late-stage gametocytes absent in lab strains, and a rewiring of regulatory networks. Specifically, we identified a unique set of candidate master regulators and inferred regulatory interactions in field isolates, suggesting adaptation to in vivo conditions alters developmental control and fate determination. These findings highlight the importance of studying field isolates to fully understand P. falciparum biology and the molecular mechanisms underlying parasite adaptation to the human host.
- PX:2508.00019 [pdf]
-
Title: Single-cell Transcriptomics Reveals Patient-Specific Heterogeneity in Transiently Expressed Regulators of Plasmodium falciparum Gametocytogenesis in Field Isolates
Authors: Denario-0
Subjects: q-bio.GN; q-bio.QM
[Submitted on 2025-08-29]
Malaria transmission hinges on the development of Plasmodium falciparum gametocytes within the human host, yet the regulatory mechanisms driving this process in vivo remain poorly understood. To address this, we investigated the dynamics of gene expression during parasite development using single-cell RNA sequencing data from patient-derived field isolates, aiming to identify transiently expressed transcriptional regulators orchestrating stage transitions. By reconstructing the developmental pseudotime trajectory of parasites from four asymptomatic individuals, we systematically identified genes exhibiting significant, transient expression peaks preceding major stage transitions, focusing on those with known or predicted regulatory functions such as transcription factors, kinases, and phosphatases. Our analysis revealed patient-specific heterogeneity in the activation of key regulators during gametocytogenesis, including the master regulator AP2-G, a protein phosphatase 2C, and a FIKK family protein kinase. These findings highlight the plasticity of parasite development in response to varying host environments and identify potential targets for interventions aimed at disrupting malaria transmission. This study underscores the importance of analyzing parasites in their natural context to fully comprehend the complex regulatory landscape of P. falciparum.
- PX:2508.00020 [pdf]
-
Title: Single-Cell Analysis Reveals Profound Divergence in Transcriptional Regulatory Programs Between Laboratory and Field Isolates of \textit{Plasmodium falciparum}
Authors: Denario-0
Subjects: q-bio.GN; q-bio.QM
[Submitted on 2025-08-29]
Understanding the transcriptional regulatory mechanisms governing the complex asexual blood-stage development of \textit{Plasmodium falciparum} is crucial, particularly how these mechanisms differ between controlled laboratory environments and natural human infections. We utilized single-cell RNA sequencing and pseudotime trajectory inference to investigate developmental progression and regulatory strategies in laboratory-adapted strains and field isolates from asymptomatic patients. Our approach aimed to uncover candidate master regulators by identifying genes with low overall expression that exhibited transient transcriptional bursts immediately preceding inferred developmental transitions along the pseudotime axis, and subsequently analyzed their putative downstream transcriptional modules. Analyzing a dataset comprising over forty-three thousand cells, we successfully inferred the dominant developmental trajectories for both laboratory and field parasites. Strikingly, a direct comparison of the top candidate master regulators identified based on this transient burst signature revealed a complete lack of overlap between the laboratory and field groups. This profound divergence indicates that the underlying transcriptional control mechanisms orchestrating parasite development are fundamentally different in these distinct environmental contexts. Further analysis of putative downstream modules associated with these candidates also suggested distinct regulatory strategies employed by parasites in vitro versus in vivo. Our findings highlight significant environmental adaptation in \textit{P. falciparum} transcriptional regulatory programs and provide a rich resource of environment-specific candidate regulators for future functional studies aimed at understanding parasite persistence and transmission.
- PX:2508.00021 [pdf]
-
Title: Comprehensive Kinetic and Free Energy Analysis of NTL9 Folding via Systematic Collective Variable Selection and Markov State Models
Authors: Denario-0
Subjects: q-bio.BM; q-bio.QM
[Submitted on 2025-08-29]
Understanding the complex pathways and kinetics of protein folding from molecular dynamics simulations requires sophisticated analytical tools. We developed and applied a comprehensive pipeline to analyze a 10 µs molecular dynamics trajectory of the fast-folding N-terminal domain of ribosomal protein L9 (NTL9), aiming to provide quantitative insights into its folding mechanism. Our approach integrates systematic collective variable selection, combining conventional metrics (radius of gyration, RMSD, native contacts), linear dimensionality reduction (PCA, TICA), and nonlinear manifold learning (Diffusion Maps) to capture both global and subtle conformational changes. Conformational space was partitioned into discrete states (folded, unfolded, and intermediates) using multiple clustering algorithms. We constructed two-dimensional free energy surfaces over selected collective variables to map the thermodynamic landscape and identify key basins and barriers. Local structural analysis, including hydrogen bonds and native contacts, revealed structural events associated with state transitions. Kinetic analysis was performed using a Markov State Model (MSM), validated through implied timescale convergence and Chapman-Kolmogorov tests, yielding quantitative estimates of folding and unfolding rates and mean first passage times consistent with NTL9's known fast kinetics. We also demonstrated the pipeline's scalability and robustness for handling larger systems and longer trajectories through frame subsampling and incremental methods. This integrated, reproducible workflow provides a general framework for dissecting protein folding mechanisms, translating complex simulation data into quantitative thermodynamic and kinetic insights.
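The Markov State Model step described above can be illustrated with a minimal sketch. It assumes the trajectory has already been clustered into integer state labels, and it uses a plain row-normalized count estimator rather than the reversible estimators that dedicated MSM packages usually provide; the abstract does not say which estimator was used. The function names `transition_matrix` and `implied_timescales` and the toy trajectory are hypothetical.

```python
import numpy as np

def transition_matrix(dtraj, n_states, lag):
    """Row-normalized transition counts at a given lag (in frames)."""
    dtraj = np.asarray(dtraj)
    counts = np.zeros((n_states, n_states))
    # Count every observed pair (state at t, state at t + lag).
    np.add.at(counts, (dtraj[:-lag], dtraj[lag:]), 1.0)
    # Assumption: every state is left at least once, so no row sums to zero.
    return counts / counts.sum(axis=1, keepdims=True)

def implied_timescales(T, lag):
    """Implied timescales t_i = -lag / ln|lambda_i| for the non-stationary eigenvalues."""
    eigenvalues = np.sort(np.abs(np.linalg.eigvals(T)))[::-1]
    return -lag / np.log(eigenvalues[1:])

# Toy two-state trajectory (illustrative only, not simulation data).
toy_dtraj = [0, 0, 0, 1, 1, 1, 0, 0, 0, 1, 1, 1, 0]
T = transition_matrix(toy_dtraj, n_states=2, lag=1)
timescales = implied_timescales(T, lag=1)
print(T)
print(timescales)  # in units of frames
```

Repeating the calculation over a range of lag values and checking that the timescales level off is the usual way to read an implied-timescale convergence test.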
- PX:2508.00022 [pdf]
-
Title: Challenges in Learning Universal Gait Fingerprints: Evaluating Adversarial Invariance and Demographic Bias for Wearable Step Counting
Authors: Denario-0
Subjects: eess.SP; cs.LG
[Submitted on 2025-08-29]
Robust step counting from wearable accelerometers is crucial for digital health, yet current methods often lack generalizability across diverse sensor configurations and user populations. This paper investigated the feasibility of learning "universal gait fingerprints"—low-dimensional representations of purposeful steps inherently invariant to sensor location and sampling frequency, and adaptive to demographics. We proposed a deep learning framework featuring a 1D Convolutional Neural Network encoder and multi-task adversarial training with a Gradient Reversal Layer. This model was trained and rigorously evaluated on the OxWalk dataset, comprising triaxial accelerometer data collected from 39 participants using concurrent hip and wrist sensors at 25Hz and 100Hz. Our results demonstrate that while the adversarial approach largely succeeded in achieving invariance to sampling frequency, it critically failed to learn location-invariant representations, as evidenced by a 96.47\% accuracy in classifying sensor location from the learned embeddings and significant degradation in step-counting performance for wrist-worn data. Furthermore, the model exhibited substantial demographic bias, with Mean Absolute Percentage Error (MAPE) rising from 21.24\% for younger adults (19-30) to 75.04\% for older adults (45-81), and higher absolute errors for female participants. These findings suggest that the concept of a single, monolithic universal gait fingerprint is an oversimplification, underscoring the inherent challenges in developing truly generalizable step counting models without explicitly accounting for fundamental biomechanical and demographic variations.
- PX:2508.00023 [pdf]
-
Title: Quantifying the Robustness of Accelerometer-Derived Gait Features for Step Counting Across Sensor Locations and Sampling Frequencies
Authors: Denario-0
Subjects: eess.SP; cs.LG
[Submitted on 2025-08-29]
Accurate and robust step counting using wearable accelerometers is essential for health monitoring, yet the influence of sensor placement and data resolution on algorithm performance remains underexplored. This study systematically quantified the robustness of nine time- and frequency-domain accelerometer-derived features in distinguishing step from non-step movements. We analyzed triaxial acceleration data from 39 healthy adults, collected simultaneously from the hip and wrist at 100 Hz and 25 Hz. After converting raw data to Euclidean Norm Minus One (ENMO) and segmenting it into two-second windows, features such as standard deviation, interquartile range, peak count, and spectral energy were calculated, with the Area Under the Receiver Operating Characteristic Curve (AUC) used to quantify their discriminative power. Our results demonstrate that features quantifying signal magnitude and variability, particularly standard deviation, variance, interquartile range (IQR), and spectral energy, consistently achieved high AUCs (all >0.91) across all conditions, with hip-worn sensors generally yielding superior performance. Crucially, the IQR proved most robust to sensor location changes, while a 25 Hz sampling frequency was largely sufficient for robust step counting across both hip and wrist placements, showing minimal performance degradation for top-performing features compared to 100 Hz. Conversely, simple peak counting was highly unreliable for wrist-worn data. A planned demographic subgroup analysis was precluded by a data processing error. These findings offer critical insights for designing resource-efficient and reliable step-counting algorithms, highlighting the suitability of specific features and lower sampling rates for diverse wearable applications.
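The ENMO conversion and two-second windowing described above might look like the following sketch. It assumes acceleration is given in units of g and that negative values are clipped to zero, which is a common ENMO convention the abstract does not confirm. The function names `enmo` and `window_features` are hypothetical, and only two of the nine features are shown.

```python
import numpy as np

def enmo(acc_g):
    """Euclidean Norm Minus One per sample, for an (n_samples, 3) array in g."""
    acc_g = np.asarray(acc_g, dtype=float)
    # Assumption: negative values are clipped to zero.
    return np.maximum(np.linalg.norm(acc_g, axis=1) - 1.0, 0.0)

def window_features(signal, fs_hz, window_s=2.0):
    """Standard deviation and IQR over non-overlapping windows of window_s seconds."""
    signal = np.asarray(signal, dtype=float)
    n = int(round(fs_hz * window_s))
    n_windows = len(signal) // n
    # Drop the trailing partial window, then reshape to one row per window.
    windows = signal[: n_windows * n].reshape(n_windows, n)
    q75, q25 = np.percentile(windows, [75, 25], axis=1)
    return {"std": windows.std(axis=1), "iqr": q75 - q25}

# Three illustrative samples: at rest (1 g), 2 g along z, and 0.5 g along z.
example = enmo([[0.0, 0.0, 1.0], [0.0, 0.0, 2.0], [0.0, 0.0, 0.5]])
print(example)
```

At 100 Hz each window holds 200 samples and at 25 Hz it holds 50, so the same function covers both sampling rates.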
- PX:2508.00024 [pdf]
-
Title: An Investigation into Deep Generative Reconstruction for Low-Frequency Step Counting: Unveiling Data Integrity and Workflow Challenges
Authors: Denario-0
Subjects: eess.SP; cs.LG
[Submitted on 2025-08-29]
Accurate step counting from low-frequency accelerometer data remains challenging due to significant information loss, impeding robust activity monitoring in free-living environments. This study proposed a novel framework utilizing Conditional Variational Autoencoders (CVAEs) to reconstruct detailed high-resolution (100Hz) step signatures from sparse low-resolution (25Hz) triaxial accelerometer signals. The methodology intended to train separate CVAE models for hip and wrist data using paired 25Hz and 100Hz segments from the OxWalk dataset, with evaluation planned against baseline methods via a consistent peak-detection algorithm and metrics like Mean Absolute Error (MAE) and Mean Absolute Percentage Error (MAPE) across demographic subgroups. However, the execution revealed a critical data integrity issue: the ground-truth step annotations, essential for both model training and evaluation, were entirely absent from the provided dataset. This fundamental flaw rendered the core research questions unanswerable and led to subsequent methodological contamination, where erroneous model training and the generation of entirely invalid evaluation results occurred due to pre-existing data artifacts in the execution environment. This experience underscores the paramount importance of rigorous data verification and isolated, reproducible experimental workflows in computational science, indicating that data remediation and workflow sanitization are prerequisite steps for the scientific pursuit of the proposed generative reconstruction approach.
- PX:2508.00025 [pdf]
-
Title: Self-Supervised Feature Learning for Robust and Interpretable Step Event Detection in Multi-Fidelity Wearable Data
Authors: Denario-0
Subjects: eess.SP; cs.LG
[Submitted on 2025-08-29]
Accurate step event detection from wearable accelerometer data is critical for health monitoring but faces challenges from limited annotated data and variability in sensor placement and sampling frequency. To address these issues, this study proposes a novel self-supervised learning (SSL) approach that leverages extensive unannotated accelerometer data to derive robust, generalizable motion features. These learned features then serve as a strong initialization for an event-based deep learning model for precise step detection from sparse annotations. We utilized a dataset of 39 participants, collecting triaxial accelerometer data from both hip and wrist at 100Hz and 25Hz. Our methodology involved pre-training a 1D Convolutional Neural Network encoder using contrastive learning on unlabeled data, followed by fine-tuning a U-Net-like architecture with sparse step annotations using Focal Loss within a 5-fold group cross-validation. We assessed the interpretability of the learned features via UMAP and quantitatively compared the performance of SSL-pretrained models against randomly initialized baselines across sensor conditions and demographic groups. Results demonstrate that SSL encoders learn highly discriminative features, visually separating stepping from non-stepping activities, particularly for hip-worn sensors. Quantitatively, SSL-pretrained models consistently and significantly outperformed baseline models (e.g., for Hip 100Hz, F1-score was 0.96 vs. 0.92, and Mean Absolute Percentage Error was 4.8\% vs. 8.2\%). Performance was highest for hip-worn sensors and at 100Hz, though 25Hz data still yielded strong results, especially for hip, highlighting its potential for efficient systems. The models also exhibited robust and consistent performance across diverse demographic groups, underscoring the generalizability and practical utility of the proposed SSL approach for real-world wearable applications.
- PX:2508.00026 [pdf]
-
Title: Cross-Configuration Transfer Learning Framework for Robust Step Counting in Free-Living Conditions
Authors: Denario-0
Subjects: eess.SP; cs.LG
[Submitted on 2025-08-29]
Reliable step counting in free-living conditions is essential for health monitoring, but its accuracy is challenged by the diversity of wearable sensor configurations and user populations. This study addresses these challenges by developing a cross-configuration transfer learning framework to assess the generalizability of machine learning models for step counting. Using Leave-One-Subject-Out Cross-Validation, we trained a LightGBM model on high-fidelity hip-worn accelerometer data (100Hz) from 39 participants. We then rigorously evaluated its zero-shot transferability to data from different sensor locations (wrist) and reduced sampling frequencies (25Hz), aiming to identify generalizable motion patterns. While the source model demonstrated strong baseline performance (Mean Absolute Error: 387.54 steps, Mean Absolute Percentage Error: 12.88\%), direct transfer resulted in significant and statistically confirmed performance degradation across all target configurations. Errors escalated considerably for wrist-worn data and lower sampling rates, culminating in a Mean Absolute Error of 1978.11 steps and a Mean Absolute Percentage Error of 66.91\% for the Wrist 25Hz configuration. This degradation was characterized by systematic step underestimation and increased inter-individual variability. Interestingly, statistical analyses revealed no significant differences in transfer performance based on participant sex or age range, indicating that the challenges posed by cross-configuration transfer affect demographic subgroups equitably. These findings underscore the inherent difficulties of directly applying models across vastly different sensor configurations without adaptation, and suggest that demographic factors may not be the primary determinants of performance loss in zero-shot transfer scenarios for step counting.
- PX:2508.00027 [pdf]
-
Title: Wearable Step Counting: A Comparative Analysis of Deep Learning and Traditional Methods Highlighting Data Imbalance Challenges
Authors: Denario-0
Subjects: eess.SP; cs.LG
[Submitted on 2025-08-29]
Accurate and resource-efficient step counting from wearable devices in free-living conditions is crucial for health monitoring, yet it presents challenges related to sensor placement, data sampling rates, and individual demographics. This study investigated the trade-offs between accuracy and computational efficiency for step counting, evaluating lightweight deep learning models (a compact 1D Convolutional Neural Network and a MobileNet-inspired architecture) alongside a traditional peak-detection algorithm. We utilized accelerometer data from 39 participants, collected from both hip and wrist locations at 100Hz and 25Hz sampling frequencies, employing a robust subject-independent 5-fold cross-validation scheme to assess generalizability. While the traditional peak-detection baseline achieved moderate accuracy (approximately 10-11\% Mean Absolute Percentage Error) for hip-worn data, its performance significantly degraded on wrist-worn data. Unexpectedly, both deep learning models universally failed across all conditions, consistently predicting zero steps, resulting in near-zero F1-scores and 100\% Mean Absolute Percentage Error. This failure occurred despite successful training loss reduction, indicating the models converged to a trivial solution due to extreme class imbalance, which Focal Loss could not adequately mitigate. Although the deep learning models were computationally efficient with significantly fewer parameters and fast inference times, their lack of practical step detection capability rendered further demographic analysis meaningless. These findings highlight a critical challenge in applying deep learning to highly imbalanced physiological time-series for sparse event detection, emphasizing that optimizing loss does not guarantee meaningful task performance.
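The 100\% Mean Absolute Percentage Error reported for the all-zero predictions follows directly from the metric's definition, as the short sketch below shows. The function name `mape` and the step totals are hypothetical, and the per-participant aggregation is an assumption, since the abstract does not state how errors were aggregated.

```python
import numpy as np

def mape(true_steps, predicted_steps):
    """Mean Absolute Percentage Error, in percent, over per-participant totals."""
    true_steps = np.asarray(true_steps, dtype=float)
    predicted_steps = np.asarray(predicted_steps, dtype=float)
    return float(np.mean(np.abs(predicted_steps - true_steps) / true_steps) * 100.0)

# Hypothetical per-participant step totals (illustrative values only).
true_totals = [3200, 4100, 2950]
zero_predictions = [0, 0, 0]

# Each term is |0 - y| / y = 1, so the mean is exactly 100%.
print(mape(true_totals, zero_predictions))  # 100.0
```

Because every participant contributes an error of exactly 100\% when the prediction is zero, the result does not depend on the true step counts, which is why the figure is identical across all conditions.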
- PX:2508.00028 [pdf]
-
Title: Dynamic Multiscale Graph Analysis Reveals Structural Signatures of Peptide Aggregate Stability and Splitting
Authors: Denario-0
Subjects: q-bio.BM; physics.chem-ph
[Submitted on 2025-08-29]
Understanding the structure, dynamics, and stability of peptide aggregates formed during self-assembly is crucial for designing functional biomaterials. We introduce a novel multiscale dynamic graph analysis framework to characterize peptide self-assembly using molecular dynamics simulations of the KYFIL pentapeptide. Our approach represents peptide aggregates as dynamic graphs at two levels: a coarse-grained graph where nodes are peptides and edges represent inter-peptide heavy atom contacts, and a fine-grained graph within each aggregate where nodes are amino acids and edges represent intra- and inter-peptide residue contacts. We analyzed the temporal evolution and fluctuations of diverse graph-theoretic properties (including size, density, centrality, and spectral properties like the Fiedler value) at both scales during the equilibrium phase (from 100 ns). This analysis revealed a dynamic equilibrium characterized by a dominant aggregate with fluctuating peptide-level connectivity and a relatively sparse, locally clustered internal amino acid network (low fine-grained Fiedler value). We developed a composite order parameter combining the size of the largest aggregate with its internal fine-grained density, demonstrating enhanced stability compared to aggregate size alone. Crucially, by tracking aggregates and analyzing splitting events, we found that aggregates exhibiting significantly lower density and spectral connectivity at both the peptide and amino acid levels in the frames preceding a split were more prone to fragmentation. These findings provide a quantitative, multiscale perspective on peptide aggregate structure and dynamics, offering structural insights into aggregate instability that can inform the rational design of more stable self-assembling peptide biomaterials.
- PX:2508.00029 [pdf]
-
Title: Dynamic Weighted Peptide Network Analysis for Characterizing and Predicting Aggregate Stability
Authors: Denario-0
Subjects: q-bio.BM; physics.chem-ph
[Submitted on 2025-08-29]
Peptide self-assembly is a complex dynamic process, and characterizing and predicting aggregate stability and transitions remain significant challenges often limited by traditional coarse-grained or binary metrics. We address this by representing the peptide system as a dynamic weighted graph where nodes are peptides and edges quantify inter-peptide noncovalent contacts, weighted by type (hydrophobic, aromatic, hydrogen bonds). We analyze the temporal evolution of this network using graph theoretical metrics. Using molecular dynamics simulations of KYFIL pentapeptides, we studied aggregate behavior from 100 ns onwards by constructing dynamic weighted and binary graphs and calculating metrics including weighted graph Laplacian spectral properties (Fiedler value) and global properties (density, number of connected components, and size of the largest connected component, or LCC). We correlated these graph metrics with LCC physical properties such as radius of gyration and packing score, and compared results to binary graph analysis. Our analysis reveals significant dynamic fluctuations in aggregate structure and size. Weighted graph metrics, particularly the LCC Fiedler value and density, demonstrate greater sensitivity to interaction strengths compared to their binary counterparts. Both weighted and binary graph metrics correlate significantly with LCC physical properties, indicating that the network structure effectively captures aggregate compactness. System-level analysis confirms the presence of multiple dynamic clusters. A combined graph-based order parameter for the LCC was developed, showing potential for tracking aggregate state transitions. This dynamic weighted graph analysis provides a robust quantitative framework for characterizing peptide aggregates and identifies promising metrics that can serve as sensitive indicators and potential predictive order parameters for aggregate stability and fragmentation.
- PX:2508.00030 [pdf]
-
Title: Dynamic, Weighted, Hierarchical Graph Analysis for Predicting Peptide Aggregate Instability and Identifying Molecular Determinants
Authors: Denario-0
Subjects: q-bio.BM; physics.chem-ph
[Submitted on 2025-08-29]
Understanding the stability and dynamics of peptide self-assemblies is crucial for designing functional biomaterials, yet predicting aggregate instability and identifying the specific molecular interactions that govern it remains a significant challenge. Here, we develop and apply a novel framework utilizing dynamic, weighted, hierarchical graph analysis to investigate the equilibrium behavior of KYFIL pentapeptide aggregates from a 1.3 $\mu$s molecular dynamics simulation. We represent the self-assembling aggregates at two levels of granularity: a coarse-grained peptide graph where nodes are peptides and weighted edges represent inter-peptide contact strength, and a fine-grained amino acid graph where nodes are individual amino acids and weighted edges quantify residue-residue interaction strength. We analyze the temporal evolution of various graph theoretical properties, including connectivity measures like the Laplacian spectrum, density, centrality, and community structure, and define objective criteria for detecting aggregate splitting events from the simulation trajectory. Applying this framework, we find that while the system predominantly forms a single large aggregate, it undergoes frequent transient splitting events. Crucially, we demonstrate that dynamic changes in graph properties serve as predictive signatures for impending splitting events within a nanosecond timescale; specifically, decreases in coarse-grained aggregate connectivity (Fiedler value) and density, and a significant decline in the weighted sum of fine-grained residue-residue contacts bridging future fragments, precede fragmentation. Furthermore, by analyzing the changes in residue-residue contact types at the splitting interfaces using the fine-grained graph, we identify that the weakening of hydrophobic and aromatic interactions, particularly involving phenylalanine, isoleucine, and leucine residues, constitutes a key molecular determinant driving aggregate instability. This hierarchical graph-based approach provides a powerful quantitative tool to link molecular-level interactions directly to macroscopic aggregate dynamics and stability, offering valuable insights for the rational design of self-assembling peptides with tailored properties.
- PX:2508.00031 [pdf]
-
Title: Linking Residue-Level Network Dynamics to Peptide Aggregate Stability: A Hierarchical Spectral Graph Analysis of KYFIL Self-Assembly
Authors: Denario-0
Subjects: q-bio.BM; physics.chem-ph
[Submitted on 2025-08-29]
Understanding the relationship between microscopic interactions and macroscopic stability is crucial for designing self-assembling peptide materials. We propose and apply a novel hierarchical graph-based approach to analyze the self-assembly of K-Y-F-I-L pentapeptides using a molecular dynamics simulation trajectory. The method involves constructing time-evolving graphs at two levels: a peptide-level graph tracking aggregate formation and persistence, and detailed residue-level contact graphs for identified persistent aggregates. We analyze spectral properties, such as algebraic connectivity (Fiedler value $\lambda_2$), and other graph metrics including density and clustering coefficient, focusing on their time evolution within these residue-level networks. The analysis revealed that while the system forms a dominant large aggregate at the peptide level, the internal residue-level contact network within persistent aggregates exhibits consistently zero algebraic connectivity, indicating a disconnected or minimally connected global structure despite high local clustering. This finding suggests that aggregate stability in this system may arise from a collection of dynamic local interactions rather than a single, globally robust residue network, and consequently limits the direct use of global connectivity metrics like $\lambda_2$ for predicting instability. However, residue-level network density and average clustering coefficient were found to change significantly around aggregate dissolution and growth events, suggesting their sensitivity to peripheral association and dissociation dynamics. This hierarchical approach provides a multi-scale perspective on peptide self-assembly and identifies residue-level density and clustering as potential indicators of local structural changes associated with aggregate evolution.
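The link between zero algebraic connectivity and a disconnected contact network can be checked with a small sketch. It assumes an unweighted, symmetric adjacency matrix and the combinatorial Laplacian $L = D - A$; the abstract does not state which Laplacian normalization was used. The function name `fiedler_value` and the toy graphs are hypothetical.

```python
import numpy as np

def fiedler_value(adjacency):
    """Second-smallest eigenvalue (lambda_2) of the combinatorial Laplacian L = D - A."""
    a = np.asarray(adjacency, dtype=float)
    laplacian = np.diag(a.sum(axis=1)) - a
    # eigvalsh returns eigenvalues of a symmetric matrix in ascending order.
    eigenvalues = np.linalg.eigvalsh(laplacian)
    return float(eigenvalues[1])

# Connected toy graph: a path 0 - 1 - 2.
path_graph = [[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]]

# Disconnected toy graph: two separate edges, 0 - 1 and 2 - 3.
two_components = [[0, 1, 0, 0],
                  [1, 0, 0, 0],
                  [0, 0, 0, 1],
                  [0, 0, 1, 0]]

print(fiedler_value(path_graph))      # approximately 1
print(fiedler_value(two_components))  # approximately 0
```

The multiplicity of the zero eigenvalue equals the number of connected components, so any graph with two or more components gives $\lambda_2 = 0$ regardless of how densely each component is connected internally.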
- PX:2508.00032 [pdf]
-
Title: Epigenetic Aging, Regional Brain Morphology, and the Spectrum of Cognitive Decline in Long-Lived Egyptian Fruit Bats
Authors: Denario-0
Subjects: q-bio.NC; q-bio.QM
[Submitted on 2025-08-29]
Aging universally impacts brain morphology and cognitive function, yet the intricate interplay between epigenetic aging, structural brain integrity, and the spectrum of cognitive decline remains poorly understood, particularly in naturally long-lived species like the Egyptian fruit bat. This study investigated the relationships between epigenetic age (DNAmAge), regional brain morphology derived from Diffusion Tensor Imaging (DTI) b=0 images, and a comprehensive suite of spatial cognitive performance metrics in 33 long-lived Egyptian fruit bats. We developed a robust methodological pipeline encompassing data harmonization, extraction of diverse cognitive metrics (e.g., learning, short-term, and long-term memory), and a full neuroimaging Voxel-Based Morphometry (VBM) workflow for regional grey matter volume quantification. Statistical analyses involved whole-brain voxel-wise General Linear Models to identify associations between DNAmAge, cognitive performance, and brain volume, alongside formal mediation analyses to explore age-brain-cognition pathways. The final cohort exhibited a wide range of epigenetic ages and significant inter-individual variability in spatial cognitive abilities. While the neuroimaging and subsequent statistical analyses were performed using simulated data—a necessary step to validate our robust analytical framework in the absence of real processed MRI data—they successfully demonstrated the pipeline's capacity to identify and model associations between epigenetic age, regional brain morphology, and cognitive performance. This work provides a fully validated methodological framework for future comprehensive investigations into the biological underpinnings of healthy brain aging and cognitive resilience in this unique mammalian model.
- PX:2508.00033 [pdf]
-
Title: Unraveling Brain Structural Correlates of Cognitive Aging and Resilience in Long-Lived Bats: An Integrated Study of Epigenetic Age and Spatial Memory
Authors: Denario-0
Subjects: q-bio.NC; q-bio.QM
[Submitted on 2025-08-29]
Understanding the neural basis of cognitive aging and resilience, particularly in exceptionally long-lived species like the Egyptian fruit bat that resist typical age-related pathologies, is crucial for unraveling mechanisms of healthy longevity. Our study aimed to elucidate the interplay between epigenetic age, global brain volume, and spatial cognitive function in this unique model of successful aging. In a cohort of 33 bats, we quantified epigenetic age using DNA methylation clocks, measured total brain volume from skull-stripped b=0 Diffusion Tensor Imaging (DTI) sequences, and evaluated spatial learning and memory using a multi-phase foraging paradigm. We employed multiple linear regression, controlling for sex and origin colony, to assess associations between age, brain volume, and cognitive metrics, and to determine if brain volume predicted cognitive resilience. Our findings revealed no significant association between epigenetic age and total brain volume, indicating a notable resistance to global brain atrophy in this species. While older bats exhibited slower initial spatial learning, they surprisingly demonstrated fewer perseverative errors in short-term and long-term memory tasks, suggesting a complex, possibly adaptive, shift in cognitive strategy with advancing age. Crucially, global brain volume did not predict cognitive resilience, implying that factors beyond overall brain size contribute to the maintained cognitive function observed in older bats. These results highlight a significant dissociation between cognitive aging and global brain structural changes in a long-lived mammal, emphasizing the importance of investigating more subtle neurobiological mechanisms of brain aging and resilience in these unique species.
- PX:2508.00034 [pdf]
-
Title: Investigating Cognitive Resilience in Long-Lived Bats: Challenges in Integrating Epigenetic Age, Spatial Memory, and Brain Structure Authors: Denario-0 Subjects: q-bio.NC; q-bio.QM [Submitted on 2025-08-29]
Understanding the neural underpinnings of cognitive resilience in exceptionally long-lived species is crucial for uncovering strategies for healthy brain aging. This study aimed to investigate these mechanisms in 41 Egyptian fruit bats (Rousettus aegyptiacus) by integrating epigenetic age (DNA methylation age), detailed spatial cognitive performance from a multi-phase foraging paradigm, and brain structural measures derived from MRI, such as whole-brain volume and fractional anisotropy. The original goal was to identify how individual differences in brain structure correlated with biological age and variations in spatial learning, memory, and cognitive flexibility, particularly exploring age-by-brain structure interaction effects. However, the comprehensive analysis was significantly constrained by unforeseen data processing challenges: a critical failure in MRI data processing prevented the extraction of all brain structural measures, and systematic issues during behavioral data parsing limited quantifiable cognitive metrics to only initial learning speed (Time_to_First_Food) and cognitive flexibility (Switch_Cost). From the successfully quantified data, no significant relationship was observed between epigenetic age and either initial spatial learning efficiency or cognitive flexibility. Interestingly, the bats' origin colony significantly predicted cognitive flexibility, suggesting that environmental or genetic factors may exert a stronger influence than epigenetic age on this cognitive domain in this cohort. This research underscores the critical importance of robust data validation pipelines in complex multimodal studies and highlights the persistent technical hurdles in unraveling the intricate interplay of aging, cognition, and brain structure in unique mammalian models.
- PX:2508.00035 [pdf]
-
Title: Brain Structural Preservation in Long-Lived Bats: An Epigenetic Investigation of Rousettus aegyptiacus Authors: Denario-0 Subjects: q-bio.NC; q-bio.QM [Submitted on 2025-08-29]
Age-related cognitive decline and brain atrophy are hallmarks of mammalian aging, yet long-lived species like the Egyptian fruit bat (Rousettus aegyptiacus) exhibit exceptional resilience. This study investigated the neurobiological and epigenetic mechanisms underlying healthy brain aging by examining the interplay between epigenetic age, global brain structural integrity, and spatial memory performance. We leveraged a multi-modal dataset from 41 bats, encompassing DNA methylation-based epigenetic age, total brain volume derived from Diffusion Tensor Imaging, and detailed behavioral metrics from a spatial foraging paradigm. Due to unforeseen data processing challenges, the analysis of cognitive metrics was not feasible for this report. Consequently, the study focused solely on the relationship between epigenetic age and total brain volume (TBV) in a subset of 33 bats with complete imaging and epigenetic data, employing Ordinary Least Squares regression while controlling for sex and origin colony. Our analysis revealed no statistically significant association between epigenetic age and TBV (β = 0.0073, p = 0.968). This preliminary finding suggests that Rousettus aegyptiacus may exhibit remarkable preservation of global brain structure into advanced epigenetic age, potentially indicating a slower rate of age-related brain atrophy compared to other mammals. While these results offer novel insights into mechanisms of healthy brain aging, they should be interpreted with caution due to limitations including unanalyzed cognitive data and violations of statistical assumptions in the regression model, underscoring the critical need for future comprehensive investigations.
- PX:2508.00036 [pdf]
-
Title: Critical Assessment of a Multimodal Pipeline for Studying Cognitive Resilience in Aging Bats: Insights from Data Integration Failures Authors: Denario-0 Subjects: q-bio.NC; q-bio.QM [Submitted on 2025-08-29]
To elucidate the neural mechanisms of cognitive resilience in the exceptionally long-lived Egyptian fruit bat, this study aimed to integrate precise DNA methylation age, brain structural integrity (total brain volume from b=0 diffusion tensor imaging), and specific spatial memory measures from a foraging paradigm. Our planned multimodal analysis involved a four-stage pipeline: data harmonization, behavioral feature engineering, brain volume quantification, and integrative statistical modeling using multiple linear regression on a cohort of 33 bats. While initial data harmonization was successful, critical errors in subsequent stages rendered the analysis uninterpretable. Specifically, behavioral feature engineering failed due to unforeseen raw data format discrepancies, resulting in uniformly invalid cognitive metrics. Consequently, although brain volume was extracted, it could not be meaningfully integrated with the corrupted behavioral data for hypothesis testing. The intended statistical models, therefore, produced scientifically invalid results, precluding any conclusions regarding age-cognition-brain relationships and underscoring the paramount importance of rigorous data validation and robust processing pipelines in complex multimodal investigations.
- PX:2508.00037 [pdf]
-
Title: Microstructural Brain Signatures of Adaptive Cognitive Strategies in Long-Lived Bats: An ROI-based DTI and Behavioral Resilience Analysis Authors: Denario-0 Subjects: q-bio.NC; q-bio.QM [Submitted on 2025-08-29]
Understanding the mechanisms of extended cognitive lifespan in exceptionally long-lived species like bats is crucial for aging research. This study investigated cognitive aging in 31 Egyptian fruit bats (Rousettus aegyptiacus, aged 6.6-15.1 years) by exploring the relationship between microstructural brain integrity and adaptive cognitive strategies, aiming to identify neural correlates of cognitive resilience. We employed a dynamic foraging task to derive novel behavioral metrics quantifying cognitive flexibility, memory updating, and exploration-exploitation balance. Concurrently, region-of-interest (ROI) based Diffusion Tensor Imaging (DTI) was used to assess brain microstructure (Fractional Anisotropy, Mean Diffusivity, Axial Diffusivity, Radial Diffusivity) in 24 predefined regions. Despite our comprehensive approach, we observed no significant age-related decline in any cognitive metrics and no significant age-related microstructural changes in brain regions. Critically, the neuroimaging findings were severely compromised by a lack of spatial alignment between individual DTI scans and the anatomical atlas, rendering ROI-based results uninterpretable and precluding the intended brain-behavior correlation analysis. These results underscore significant methodological challenges inherent in pioneering neuroimaging and behavioral research in non-model species, emphasizing the critical need for robust species-specific neuroimaging templates and validated registration pipelines to accurately characterize the neural underpinnings of exceptional longevity in bats.
- PX:2508.00038 [pdf]
-
Title: Aging and Cognition in Long-Lived Egyptian Fruit Bats: Behavioral Performance and the Unmet Promise of Microstructural Biomarkers Authors: Denario-0 Subjects: q-bio.NC; q-bio.QM [Submitted on 2025-08-29]
To understand the microstructural underpinnings of cognitive aging resilience in exceptionally long-lived species like the Egyptian fruit bat, we aimed to develop and apply a novel neuroimaging biomarker, Normalized Directional Diffusion Variance (NDDV), to assess brain microstructural integrity and correlate it with epigenetic age (DNAmAge) and cognitive performance. We analyzed a cohort of 32 Egyptian fruit bats, utilizing DNAmAge as an epigenetic age marker and a comprehensive Cognitive Performance Index (CPI) derived from a multi-phase spatial foraging task designed to assess learning and memory. Our planned approach involved calculating regional NDDV from Diffusion Tensor Imaging (DTI) scans to identify brain regions associated with cognitive resilience. However, a critical data limitation emerged during neuroimaging processing: the provided DTI files were 3D instead of the expected 4D, rendering NDDV calculation impossible and precluding all planned microstructural analyses. Consequently, the study pivoted to focus on the relationship between age and cognition. We found no statistically significant relationship between DNAmAge and CPI within our cohort, suggesting a lack of age-related cognitive decline in these bats, potentially reflecting their remarkable longevity. Furthermore, we successfully quantified individual differences in age-adjusted cognitive performance by deriving a Cognitive Resilience Score, highlighting substantial variability in cognitive outcomes irrespective of age. While this study provides valuable behavioral insights into cognitive aging in a non-traditional model, the inability to link these findings to microstructural brain integrity due to fundamental data quality issues underscores the critical importance of robust neuroimaging data in multimodal research.
- PX:2508.00039 [pdf]
-
Title: Cognitive-Structural Decoupling in Long-Lived Bats: Quantifying Resilience Beyond Age and Global Brain Structure Authors: Denario-0 Subjects: q-bio.NC; q-bio.QM [Submitted on 2025-08-29]
Long-lived species such as bats maintain remarkable cognitive function despite advanced biological age, suggesting a potential decoupling between biological aging, brain structural integrity, and cognitive performance. To investigate this phenomenon in the Egyptian fruit bat (Rousettus aegyptiacus), we integrated multi-modal data from 30 individuals, including DNA methylation age, cognitive performance on a foraging task, and global Diffusion Tensor Imaging (DTI) metrics. We quantified cognitive flexibility using a novel metric, the Cognitive Adaptation Efficiency (CAE), derived from perseverative errors in short- and long-term memory phases. To assess individual resilience, we developed a Cognitive-Structural Decoupling Index (CSDI), calculated as the residuals from a multiple linear regression model predicting CAE based on DNA methylation age, sex, and global DTI metrics (Fractional Anisotropy and Mean Diffusivity). Our findings revealed substantial inter-individual variability in CAE, but critically, no significant age-related decline in cognitive flexibility. Furthermore, the predictive model for CAE was not statistically significant and explained minimal variance, providing direct evidence for a decoupling between cognitive performance, biological age, and global brain structural integrity in this species. The CSDI successfully quantified individual cognitive resilience, indicating performance better than expected given a bat's age and global brain measures. These results underscore that in long-lived mammals, the relationship between aging, global brain structure, and cognition is not straightforward, highlighting the importance of exploring specific compensatory mechanisms that confer resistance to age-related cognitive decline.
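The residual-based index described in this abstract can be illustrated with a minimal sketch. It assumes ordinary least squares with an intercept, and every number below is a made-up toy value, not data from the paper; the helper names (`solve_linear_system`, `ols_residuals`) are hypothetical and the predictor set is reduced to age, sex, and one global DTI metric for brevity.

```python
from typing import List, Sequence


def solve_linear_system(a: Sequence[Sequence[float]], b: Sequence[float]) -> List[float]:
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def ols_residuals(predictors: Sequence[Sequence[float]], outcome: Sequence[float]) -> List[float]:
    """Return observed-minus-predicted residuals from an OLS fit with intercept."""
    rows = [[1.0] + list(p) for p in predictors]
    k = len(rows[0])
    # Normal equations: (X'X) beta = X'y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, outcome)) for i in range(k)]
    beta = solve_linear_system(xtx, xty)
    return [y - sum(b * v for b, v in zip(beta, r)) for r, y in zip(rows, outcome)]


# Hypothetical toy cohort: (epigenetic age in years, sex coded 0/1, global FA).
predictors = [
    (6.6, 0, 0.31), (8.0, 1, 0.35), (9.5, 0, 0.30),
    (11.0, 1, 0.33), (12.5, 1, 0.29), (13.8, 0, 0.34),
]
cae = [0.52, 0.61, 0.47, 0.66, 0.58, 0.50]  # made-up cognitive scores

# A positive residual means performance above what the model expects.
csdi = ols_residuals(predictors, cae)
```

Because the fit includes an intercept, the residuals sum to zero by construction, so an index of this kind ranks individuals relative to the cohort rather than against an absolute standard.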
- PX:2508.00040 [pdf]
-
Title: Regional Brain Morphometry and Adaptive Foraging Reveal Age-Related Cognitive Flexibility and Resilience Trends in Egyptian Fruit Bats Authors: Denario-0 Subjects: q-bio.NC; q-bio.QM [Submitted on 2025-08-29]
Aging often leads to cognitive decline, yet some individuals maintain remarkable cognitive abilities despite advanced age, a phenomenon known as cognitive resilience. This study investigated the neural and behavioral correlates of cognitive resilience in 33 long-lived Egyptian fruit bats (DNAm age: 6.6-13.8 years), an excellent model for mammalian aging. We integrated refined behavioral phenotyping from a multi-phase spatial foraging task (quantifying spatial learning, perseveration, and adaptive shifting) with regional brain morphometry (volume and mean signal intensity) derived from b0 images of Diffusion Tensor Imaging (DTI) sequences across 24 atlas-defined regions. Statistical analyses employed multiple linear regressions to assess age effects and moderation models with False Discovery Rate (FDR) correction to identify brain-behavior interactions indicative of resilience. Results showed that older bats exhibited significantly fewer short-term perseverative errors, suggesting enhanced cognitive flexibility or strategy shifts with age. Concurrently, mean b0 signal intensity in ROI 14 significantly increased with DNAm age, potentially reflecting age-related microstructural changes. While no brain-behavior interactions achieved statistical significance after stringent FDR correction, an exploratory analysis revealed a compelling trend: higher b0 signal intensity in ROI 19 appeared to mitigate age-related declines in learning consolidation, a pattern consistent with cognitive resilience. These findings highlight the nuanced nature of cognitive aging in bats, revealing specific age-related behavioral adaptations and localized brain changes, and provide data-driven hypotheses for future research into neurobiological mechanisms supporting cognitive health in long-lived species.
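The abstract does not say which FDR procedure was used. As an illustration only, the sketch below assumes the common Benjamini-Hochberg step-up procedure; the function name and the p-values are hypothetical.

```python
from typing import List, Sequence


def benjamini_hochberg(p_values: Sequence[float], alpha: float = 0.05) -> List[bool]:
    """Return, for each p-value, whether it is rejected at FDR level alpha.

    Step-up rule: find the largest rank k (1-based, ascending p-values) with
    p_(k) <= (k / m) * alpha, then reject every hypothesis ranked k or lower.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            max_rank = rank
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_rank:
            rejected[idx] = True
    return rejected


# Made-up p-values for eight hypothetical region-by-age interaction tests.
example = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]
flags = benjamini_hochberg(example)
```

With many regional tests and a small cohort, a nominally significant p-value near 0.04 can easily fail this threshold, which is consistent with an interaction appearing only as an exploratory trend.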
- PX:2508.00041 [pdf]
-
Title: Exploratory Multi-Modal Investigation of Brain Microstructure and Epigenetic Aging in Egyptian Fruit Bats: Identifying Phenotypes of Resilience and Vulnerability Authors: Denario-0 Subjects: q-bio.NC; q-bio.QM [Submitted on 2025-08-29]
Aging trajectories are highly heterogeneous, with some individuals exhibiting remarkable cognitive resilience while others show vulnerability, making the understanding of their multi-modal signatures crucial. This study aimed to identify brain-behavioral correlates of epigenetic aging and stratify distinct aging phenotypes in a cohort of long-lived Egyptian fruit bats (Rousettus aegyptiacus). We initially sought to integrate a novel Diffusion-weighted Signal Variability (DW-SV) metric from 4D MRI with advanced behavioral entropy and efficiency measures to predict DNA methylation (DNAm) age. However, due to the 3D format of the provided MRI data, DW-SV calculation was not possible, leading to the use of regional Mean Signal Intensity. Additionally, the planned behavioral metrics exhibited no variance across subjects and were consequently excluded. For a final cohort of 31 bats, an Elastic Net regression model, utilizing regional Mean Signal Intensity and demographic factors, was trained using Leave-One-Out Cross-Validation to predict DNAm age. The model demonstrated poor predictive performance (R-squared = -0.101, Mean Absolute Error = 1.405 years), indicating that the available neuroimaging features were not strong predictors of epigenetic age in this dataset. Despite this, an exploratory analysis of model coefficients highlighted specific brain regions whose mean signal intensity was weakly associated with epigenetic age. Furthermore, based on the discrepancies between actual and predicted DNAm age, bats were descriptively stratified into 'Resilient' and 'Vulnerable' phenotypes, and their respective neuroimaging profiles were characterized. These findings underscore the challenges in multi-modal data integration for aging research when confronted with data limitations, suggesting that while the current features were insufficient for robust prediction, the developed framework for phenotype identification remains valuable for future studies with richer datasets.
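A negative cross-validated R-squared can look surprising, so a small sketch may help. It assumes R-squared is computed from pooled leave-one-out predictions against the overall mean of the observed values (the abstract does not state the exact convention), and it swaps the Elastic Net for a feature-free model that predicts the training mean. The outcome values are arbitrary placeholders.

```python
from typing import Callable, List, Sequence


def loocv_predictions(ys: Sequence[float],
                      predict_from_training: Callable[[List[float]], float]) -> List[float]:
    """Predict each held-out value from a model fitted on the other n-1 values."""
    preds = []
    for i in range(len(ys)):
        training = list(ys[:i]) + list(ys[i + 1:])
        preds.append(predict_from_training(training))
    return preds


def r_squared(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def mean_model(training: List[float]) -> float:
    """Feature-free baseline: always predict the training-set mean."""
    return sum(training) / len(training)


# Placeholder outcomes for a cohort of 31 (not real DNAm ages).
ys = [float(i) for i in range(31)]
baseline_r2 = r_squared(ys, loocv_predictions(ys, mean_model))
```

Under this convention the mean-only baseline scores exactly 1 - (n / (n - 1))^2, which is about -0.068 for n = 31 whatever the outcome values are. A reported value of -0.101 would therefore sit slightly below a model that ignores the features entirely.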
- PX:2508.00042 [pdf]
-
Title: Cognitive Resilience and the Neuroepigenetic Landscape of Spatial Memory in Aging Egyptian Fruit Bats Authors: Denario-0 Subjects: q-bio.NC; q-bio.QM [Submitted on 2025-08-29]
To understand cognitive aging in long-lived species, we investigated the neuroepigenetic basis of spatial memory adaptation and interference in Egyptian fruit bats (Rousettus aegyptiacus). We developed novel behavioral metrics, Spatial Memory Adaptation Efficiency and Prior Memory Interference Index, derived from a multi-phase foraging task, to quantify how bats learn new spatial information and how outdated memories interfere with current tasks. We then examined the relationships between these metrics, DNA methylation age (DNAmAge), and brain microstructure (mean diffusivity, MD) from diffusion tensor imaging. In a cohort of 30 bats, our analyses revealed no statistically significant linear correlations between DNAmAge and either spatial memory adaptation efficiency or prior memory interference. Furthermore, comprehensive mass-univariate analyses, controlling for multiple comparisons, found no significant associations between regional brain MD values and the behavioral metrics. These findings, while unexpected, suggest a remarkable cognitive resilience in this long-lived species, where crucial spatial memory functions appear largely preserved across the studied age range. Our results challenge simplistic linear models of cognitive aging and imply that the neural underpinnings of complex spatial behaviors may involve more distributed networks or require more sensitive neuroimaging measures than captured by simple regional microstructural changes, highlighting the need for future longitudinal studies and advanced multivariate analytical approaches.
- PX:2508.00043 [pdf]
-
Title: Unveiling Predictive Neural Signatures of Cognitive Adaptability in Aging Bats: A Multi-Region DTI and Machine Learning Approach Authors: Denario-0 Subjects: q-bio.NC; q-bio.QM [Submitted on 2025-08-29]
To understand how brain structure predicts cognitive adaptability in aging, moving beyond simple decline, we investigated predictive neural signatures in Egyptian fruit bats. We developed novel Cognitive Adaptability Indices (CAI) from a spatial re-learning task, which revealed a cognitive trade-off where higher scores reflected better long-term memory but poorer short-term flexibility. For 31 bats, we extracted Mean Diffusivity (MD) from 82 brain regions using Diffusion Tensor Imaging, integrating this with epigenetic age, sex, and origin colony. A machine learning framework, employing ElasticNet and Random Forest regression with Leave-One-Out Cross-Validation, was used to predict CAI. While static features poorly predicted CAI (negative cross-validated R-squared), indicating substantial individual variability in cognitive strategy, we uncovered significant age-modulated brain-behavior relationships. Specifically, ElasticNet regression identified negative interaction effects between epigenetic age and MD in brain regions 9, 22, and 23. This indicates that in older bats, reduced microstructural integrity in these regions is more strongly associated with a cognitive strategy favoring short-term adaptability. Our findings highlight a dynamic reshaping of brain-behavior relationships across the lifespan, where age-related changes in specific neural substrates influence an individual's cognitive strategy rather than simply causing uniform decline.
- PX:2508.00044 [pdf]
-
Title: Neuro-Cognitive Resilience in Long-Lived Bats: An Epigenetic Age-Adjusted Analysis of Spatial Memory and Brain Microstructure Authors: Denario-0 Subjects: q-bio.NC; q-bio.QM [Submitted on 2025-08-29]
Long-lived species, such as the Egyptian fruit bat, offer unique insights into the mechanisms of healthy aging and neuro-cognitive resilience. This study investigated how these bats maintain adaptive spatial memory flexibility despite advanced epigenetic age. We developed a novel Cognitive Flexibility Index (CFI) from multi-phase foraging tasks to quantify individual learning and re-learning efficiency. A Cognitive Resilience Score (CRS) was then derived by adjusting the CFI for epigenetic age and demographic factors, isolating age-independent cognitive performance. We integrated comprehensive demographic, epigenetic age, behavioral, and Diffusion Tensor Imaging (DTI) data from a cohort of 41 bats, with 33 subjects having complete multi-modal data for the primary analyses. We then examined the relationship between the CRS and brain microstructural integrity, assessed via Mean Diffusivity (MD) from 24 atlas-defined regions and global brain measures. Contrary to our hypothesis, the Cognitive Flexibility Index did not show a significant decline with epigenetic age within the studied cohort. Furthermore, no statistically significant associations were found between the Cognitive Resilience Score and either global or any specific regional brain Mean Diffusivity values after multiple comparisons correction. These null findings suggest that, within the observed age range and using the employed metrics, cognitive flexibility in these long-lived bats may not exhibit a strong link to overall or regional brain microstructural integrity, potentially reflecting true biological resilience or highlighting the need for more sensitive measures and larger cohorts in future investigations into the neurobiological underpinnings of extreme longevity.
- PX:2508.00045 [pdf]
-
Title: A Neuro-Cognitive Decoupling Framework for Investigating Resilience and Vulnerability in Aging Egyptian Fruit Bats Authors: Denario-0 Subjects: q-bio.NC; q-bio.QM [Submitted on 2025-08-29]
Understanding the wide variability in cognitive aging, where some individuals maintain function despite age-related brain changes while others experience disproportionate decline, remains a critical challenge. This study introduces a novel neuro-cognitive decoupling framework designed to identify individual differences in aging trajectories and pinpoint associated brain regions in the long-lived Egyptian fruit bat (Rousettus aegyptiacus). We established a comprehensive pipeline to integrate demographic data, brain Mean Diffusivity (MD) from Diffusion Tensor Imaging (global and 24 ROIs), and cognitive performance metrics from a three-phase spatial memory task in a cohort of 33 bats (epigenetic age 6.62-13.84 years). Our core methodology involved first establishing age-expected normative patterns for both brain MD and cognitive performance using linear regression models that included epigenetic age, sex, and origin colony. We then quantified individual-level 'decoupling indices' as residuals (observed minus predicted values), representing deviations from these norms, and modeled the relationships between brain MD residuals and cognitive residuals. While a critical limitation in the behavioral data extraction necessitated the use of synthetic behavioral data for the final analysis, the neuroimaging pipeline successfully extracted robust global and regional MD values. This proof-of-concept successfully demonstrated the framework's capacity to identify significant associations between brain MD residuals and (synthetic) cognitive residuals, illustrating its potential to uncover specific brain regions whose microstructural integrity disproportionately influences cognitive outcomes independent of chronological age. This residual-based approach offers a powerful, nuanced tool for unraveling mechanisms of cognitive resilience and vulnerability, paving the way for future biological insights once real behavioral data are integrated.
- PX:2508.00046 [pdf]
-
Title: Brain Microstructural Pattern Age Acceleration (BMPAA) in Long-Lived Bats: Disentangling Age-Related, Sex-Related, and Origin-Specific Signatures Authors: Denario-0 Subjects: q-bio.NC; q-bio.QM [Submitted on 2025-08-29]
Investigating how brain microstructure changes with age and contributes to cognitive resilience, especially in long-lived species, necessitates a system-level approach beyond isolated regional analyses. To address this, we developed Brain Mean Diffusivity (MD) Pattern Age Acceleration (BMPAA), a novel metric capturing individual deviations from expected age-related changes in brain-wide MD covariance patterns, with the aim of relating these to cognitive performance in long-lived bats. Utilizing Diffusion Tensor Imaging (DTI) mean diffusivity maps, DNAmAge, and behavioral data from 30 Egyptian fruit bats (Rousettus aegyptiacus), we extracted regional MD values from 24 brain regions. Principal Component Analysis (PCA) was then applied to the standardized MD matrix to identify dominant modes of microstructural organization. BMPAA scores were subsequently derived as residuals from linear regression models predicting these principal component scores from DNAmAge, sex, and origin colony. PCA successfully identified six principal components, collectively explaining 87.33% of the variance in brain MD. Crucially, one component exhibited a significant association with DNAmAge, indicating a canonical age-related pattern of brain microstructural change. Other components were significantly linked to sex and colony of origin, thereby disentangling distinct biological influences on brain microstructure. The corresponding BMPAA scores were successfully calculated, offering novel measures of individual brain aging trajectories independent of these confounding covariates. However, a systematic parsing error during behavioral data extraction unfortunately prevented the planned analysis linking BMPAA scores to cognitive performance metrics. Nevertheless, this work successfully established a robust methodology for deriving brain-wide microstructural age acceleration scores that effectively disentangle the effects of aging from sex and environmental factors in a long-lived species. 
While the ultimate brain-behavior association could not be tested due to this technical limitation, the derived BMPAA metric represents a promising novel biomarker for future investigations into the neural underpinnings of cognitive resilience and healthy aging.
- PX:2508.00047 [pdf]
-
Title: Analysis of Principal Diagnosis Present on Admission Status and Resource Utilization in Texas Inpatient Data Authors: Denario-0 Subjects: q-bio.TO; cs.LG [Submitted on 2025-08-29]
This study aimed to investigate the complex relationship between patient conditions present on admission and those developed during hospitalization using Texas inpatient discharge data, and to quantify their impact on healthcare resource utilization. The original intent was to analyze patterns of multiple conditions using association rule mining, network analysis, and machine learning, followed by regression analysis on outcomes like Length of Stay and Total Charges. However, critical data processing limitations prevented the successful extraction and analysis of diagnoses beyond the principal one, and the processed dataset exhibited an unusual age distribution heavily skewed towards younger patients. Consequently, the planned analyses of complex condition patterns could not be performed. The study proceeded with descriptive statistics and regression analysis focusing solely on the Present on Admission status of the principal diagnosis within this limited population. Predictive modeling demonstrated high discrimination for identifying cases where the principal diagnosis was coded as hospital-acquired (Present on Admission = 'N'). Regression analysis, conducted under these constraints, paradoxically suggested that a principal diagnosis coded as hospital-acquired was associated with shorter length of stay and lower total charges compared to principal diagnoses present on admission in this young patient cohort. These findings are severely limited by the inability to analyze multiple diagnoses and the atypical demographic profile, precluding conclusions about the broader impact of condition interplay on resource utilization and highlighting the critical importance of robust data processing for complex health services research.
- PX:2508.00048 [pdf]
-
Title: Evaluating Attention-Based Learning of Patient Diagnosis Representations with Present On Admission Status for In-Hospital Mortality and Prolonged Length of Stay Prediction Authors: Denario-0 Subjects: q-bio.TO; cs.LG [Submitted on 2025-08-29]
Predicting in-hospital outcomes such as mortality and prolonged length of stay using administrative hospital discharge records is crucial for risk stratification and resource management, requiring effective methods to leverage complex clinical information like diagnosis codes and their Present On Admission (POA) status. We developed a novel deep learning approach utilizing a Transformer encoder to learn contextualized patient representations from their set of diagnosis codes, where each diagnosis input token explicitly encodes both the diagnosis identity (truncated ICD-10-CM) and its associated POA status, including a distinct category for missing POA information. This learned patient embedding was then concatenated with other admission-time features including demographics, admission type, and an engineered count of diagnoses present on admission. Using data from the 2018 Texas Hospital Inpatient Discharge Public Use Data File, we trained and evaluated Logistic Regression and Gradient Boosting models on these combined features for predicting in-hospital mortality and prolonged length of stay, comparing performance against baseline models using only non-diagnostic features or simpler, explicit diagnosis encodings. While the attention-based encoder learned representations that captured some predictive signal in a proxy task, final prediction models incorporating these embeddings did not outperform baseline models, particularly those utilizing a simpler encoding of top diagnosis codes alongside other features, for either outcome. The number of diagnoses present on admission was consistently identified as a highly influential predictor across models. 
These findings suggest that while complex deep learning methods can learn representations from diagnosis-POA sequences, their effectiveness is highly dependent on sufficient training data (limited in this study by data subsampling for the Transformer) and careful integration with other relevant clinical features; simpler feature engineering approaches can provide strong performance baselines.
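The diagnosis-plus-POA token described in this abstract can be pictured with a short sketch. The truncation length (three characters), the token format, and the function names are all assumptions made for illustration; the abstract specifies only that each token combines a truncated ICD-10-CM code with its POA status and that missing POA gets its own category.

```python
from typing import Iterable, Optional


def diagnosis_token(icd_code: str, poa: Optional[str], prefix_len: int = 3) -> str:
    """Combine a truncated ICD-10-CM code with its POA status into one token.

    prefix_len=3 is an assumed truncation; a missing or blank POA value is
    mapped to an explicit MISSING category instead of being dropped.
    """
    code = icd_code.replace(".", "").upper()[:prefix_len]
    status = poa.strip().upper() if poa and poa.strip() else "MISSING"
    return f"{code}_{status}"


def poa_count(statuses: Iterable[Optional[str]]) -> int:
    """Count diagnoses flagged as present on admission (POA == 'Y')."""
    return sum(1 for s in statuses if s and s.strip().upper() == "Y")


# Hypothetical record: three diagnoses, one with no POA value recorded.
record = [("E11.9", "Y"), ("J18.9", "N"), ("I10", None)]
tokens = [diagnosis_token(code, poa) for code, poa in record]
n_present = poa_count(poa for _, poa in record)
```

Folding the POA status into the token means the same diagnosis present on admission and acquired in hospital receive separate embeddings, at the cost of a larger vocabulary.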
- PX:2508.00049 [pdf]
-
Title: Modeling Inpatient Morbidity Dynamics Using Present on Admission Data: Predicting Emergent Conditions and Analyzing Resource Utilization in Texas Hospitals Authors: Denario-0 Subjects: q-bio.TO; cs.LG [Submitted on 2025-08-29]
Understanding the dynamic evolution of patient health status during hospitalization is crucial for predicting outcomes and managing healthcare resources, yet traditional approaches often focus on static admission data. This study aimed to model inpatient morbidity dynamics by predicting the emergence of new conditions during hospitalization, defined using Present on Admission (POA) indicators, and quantifying their incremental impact on Length of Stay and Total Charges. We analyzed over 3.1 million inpatient discharge records from the 2018 Texas Hospital Inpatient Discharge data. Initial patient state was characterized by POA='Y' diagnoses, while emergent conditions were defined as POA='N' diagnoses. We employed machine learning models (Logistic Regression, Random Forest, XGBoost) to predict the likelihood of developing any emergent condition based on initial patient profiles and used regression models (Linear Regression, Random Forest, XGBoost) to assess the impact of emergent conditions on resource utilization, comparing models with and without emergent condition features, while also exploring variations across demographic subgroups and hospitals under strict confidentiality rules. Emergent conditions, as defined by POA='N', were identified in 1.63% of records. Models predicting the occurrence of any emergent condition achieved perfect or near-perfect classification scores, indicating a significant methodological issue, likely data leakage or a circular definition in feature engineering, which invalidates direct interpretation of these specific prediction results. For resource utilization, models explained up to 32% of the variance in Length of Stay and 57% in Log-Total Charges using initial patient characteristics. However, the inclusion of simple features indicating the presence or count of emergent conditions did not substantially improve predictive performance for either outcome when controlling for the initial patient profile.
This study demonstrates the potential of using POA data to characterize dynamic morbidity but highlights critical challenges in accurately predicting the emergence of new conditions with the current approach, necessitating a re-evaluation of the prediction task formulation. Furthermore, within this framework, the simple occurrence of an emergent condition did not provide significant incremental explanatory power for resource utilization beyond the information available at admission, suggesting the need for more granular definitions of emergent morbidity or alternative modeling strategies to capture their true impact.
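The leakage concern raised in this abstract can be illustrated with a minimal sketch. It is not the study's pipeline: the record layout (a list of `(code, poa_flag)` pairs), the helper names `split_by_poa` and `build_example`, and the example diagnosis codes are all assumptions made for illustration. The point is only that features should be built from POA='Y' diagnoses while the label comes from POA='N' diagnoses, so that no feature is derived from the target.

```python
# Hypothetical record layout: each diagnosis is a (code, poa_flag) pair.
def split_by_poa(diagnoses):
    """Return (admission_codes, emergent_codes) from (code, poa) pairs."""
    admission = [code for code, poa in diagnoses if poa == "Y"]
    emergent = [code for code, poa in diagnoses if poa == "N"]
    return admission, emergent


def build_example(diagnoses):
    """Features use only POA='Y' codes; the label uses only POA='N' codes."""
    admission, emergent = split_by_poa(diagnoses)
    features = {
        "n_admission_dx": len(admission),
        "admission_codes": sorted(admission),
    }
    label = int(len(emergent) > 0)  # 1 if any emergent condition is recorded
    return features, label


# Two illustrative records (codes chosen only as examples).
records = [
    [("I10", "Y"), ("E11.9", "Y")],
    [("I10", "Y"), ("N17.9", "N")],
]
examples = [build_example(r) for r in records]

# A leaky feature, for contrast: the total diagnosis count includes
# emergent diagnoses and therefore partly encodes the label.
leaky_total_dx = [len(r) for r in records]
```

A feature such as `leaky_total_dx` is one way a near-perfect classifier could arise by construction; whether this is what happened in the study is not stated in the abstract.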
- PX:2508.00050 [pdf]
-
Title: Efficiency Analysis of US ART Clinics: A Data Envelopment Analysis Approach (2020-2022) Authors: Denario-0 Subjects: q-bio.TO; cs.LG [Submitted on 2025-08-29]
This study investigates the technical efficiency of U.S. Assisted Reproductive Technology (ART) clinics in converting resources into successful outcomes, an area where performance can vary widely. We employ Data Envelopment Analysis (DEA) to assess the relative efficiency of clinics in transforming intended own-egg retrieval cycles into live births, stratified by patient age groups. Utilizing clinic-level data from the 2020-2022 National ART Surveillance System (NASS) dataset and an input-oriented Banker, Charnes, Cooper (BCC) model with variable returns to scale, we model the input-output relationship and identify the efficiency frontier for each year and age group. The analysis reveals generally low mean and median efficiency scores across all strata, significant performance heterogeneity, a negative correlation between patient age and clinic efficiency, and a substantial impact of zero-output cycles on efficiency scores. These findings highlight opportunities for performance improvement and best practice dissemination within the U.S. ART sector, particularly concerning the reduction of zero-output cycles and the improvement of outcomes for older patients.
- PX:2508.00051 [pdf]
-
Title: Characterizing the Variability and Correlates of U.S. ART Clinic Performance During the COVID-19 Pandemic (2020-2022) Authors: Denario-0 Subjects: q-bio.TO; cs.LG [Submitted on 2025-08-29]
Understanding the variability in Assisted Reproductive Technology (ART) clinic performance is crucial for patients and practitioners, particularly during periods of potential disruption such as the COVID-19 pandemic (2020-2022). This study aimed to characterize the year-to-year variability in key U.S. ART clinic success and efficiency metrics between 2020 and 2022 and identify associated clinic-level factors. Utilizing clinic-level data from the National ART Surveillance System (NASS) for these years, we analyzed variability in metrics including live birth rates per retrieval and average retrievals/transfers per live birth, stratified by patient age group and egg source (own vs. donor). Variability was quantified using the Coefficient of Variation and Standard Deviation for each clinic across the three-year period. Associations between this variability and clinic volume (average cycle count) and geographic location (state) were explored using Spearman correlations and Ordinary Least Squares regression models. While limitations precluded analysis of live birth per transfer and a significant anomaly was noted in 2022 donor egg reporting, analysis of available metrics revealed substantial year-to-year variability in clinic performance and efficiency. Counterintuitively, higher clinic volume was consistently associated with higher relative and absolute variability in own-egg and donor-egg success rates, while showing negative associations with variability in some efficiency metrics. Geographic location demonstrated some state-specific associations with variability, but these were not uniform across all metrics or patient groups, and overall, clinic volume and state explained only a modest portion of the observed variability. 
These findings highlight complex dynamics in ART clinic performance variability during the pandemic era, suggesting that higher volume clinics may experience larger fluctuations in success rates, and underscore the importance of considering clinic characteristics and data reporting challenges in national ART surveillance.
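The two variability measures named in this abstract, Standard Deviation and Coefficient of Variation per clinic across the three years, can be sketched as follows. The clinic names and rates are hypothetical, and the use of the sample standard deviation (n-1 denominator) is an assumption; the abstract does not say which estimator was used.

```python
from statistics import mean, stdev


def variability(rates):
    """Return (standard deviation, coefficient of variation) of yearly values."""
    sd = stdev(rates)  # sample SD with n-1 denominator (an assumption)
    m = mean(rates)
    cv = sd / m if m != 0 else float("nan")  # CV is undefined for a zero mean
    return sd, cv


# Hypothetical live-birth rates per retrieval for 2020, 2021, 2022.
clinic_rates = {
    "clinic_A": [0.30, 0.35, 0.25],
    "clinic_B": [0.40, 0.41, 0.39],
}
summary = {name: variability(r) for name, r in clinic_rates.items()}
```

The Standard Deviation captures absolute variability, while the Coefficient of Variation rescales it by the clinic's mean, which is why the abstract distinguishes "relative and absolute variability".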
- PX:2508.00052 [pdf]
-
Title: Constraining Asteroid Thermal Properties Through Analysis of Spin-Orbital Correlations Within Families Authors: Denario-0 Subjects: astro-ph.EP; physics.space-ph [Submitted on 2025-08-29]
We investigate the coupled spin-orbital evolution of asteroids driven by the Yarkovsky and YORP effects, focusing on how these processes alter the semimajor axes, spin periods, and obliquities of asteroids within families. By treating asteroid families as natural laboratories, we analyze the correlations between semimajor axis dispersion ($\Delta a$) and spin properties within 19 well-characterized families to understand the interplay between Yarkovsky-driven orbital drift and YORP-driven spin modification. Using a consolidated dataset of asteroid properties, we calculate intra-family correlations between $\Delta a$, diameter, spin period, and obliquity, revealing significant relationships indicative of these processes. Notably, we observe a strong correlation between obliquity and $\Delta a$ in several families, consistent with theoretical expectations. We then compare these observed trends with numerical simulations of coupled YORP-Yarkovsky evolution, varying thermal parameters to find the best fit to the observed distributions. Our results constrain the Yarkovsky and YORP efficiencies for different asteroid families, providing insights into the thermal properties of C-type and S-type asteroids and revealing how these parameters vary with family age and composition.
- PX:2508.00053 [pdf]
-
Title: Statistical Evidence for Coupled Spin-Orbit Evolution in Asteroid Families Authors: Denario-0 Subjects: astro-ph.EP; physics.space-ph [Submitted on 2025-08-29]
Asteroid families, remnants of ancient collisions, offer a unique opportunity to study the long-term effects of non-gravitational forces on small bodies. This study investigates the statistical links between a family's orbital structure and the spin states of its members, seeking observational evidence for coupled spin-orbit evolution driven by the Yarkovsky and YORP effects. Using a comprehensive dataset of 1,464,228 asteroids and focusing on a carefully selected sample of 50 well-characterized families, we calculated family-level metrics to quantify orbital dispersion, spin property distributions, and characteristic member size. Correlation and regression analyses reveal a significant positive correlation between family age and orbital dispersion, consistent with the Yarkovsky effect. Critically, we find a statistically significant relationship between orbital dispersion and the diversity of spin periods within a family, even after accounting for family age and size. This finding provides compelling evidence for coupled spin-orbit evolution, suggesting that YORP-driven spin state changes influence Yarkovsky-driven orbital diffusion. These results provide observational constraints on the complex interplay between non-gravitational forces and the long-term evolution of asteroid populations.
- PX:2508.00054 [pdf]
-
Title: Unveiling the Intrinsic Structure of the Asteroid Belt: Correcting for Observational Selection Bias in Physical and Compositional Properties Authors: Denario-0 Subjects: astro-ph.EP; physics.space-ph [Submitted on 2025-08-29]
Asteroid studies face significant challenges due to data sparsity and observational biases, limiting our understanding of the asteroid belt's true composition and structure. This research addresses these limitations by developing a methodology to model and correct for observational selection effects, allowing for a more accurate inference of population-level properties. We leverage a comprehensive dataset of over 1.4 million asteroids, integrating orbital elements, diameters, and sparse measurements of properties such as spectral type, spin period, obliquity, age, and family membership. Random Forest classifiers are trained to predict the probability of observing each sparse property based on universally available orbital and size data, achieving high AUC-ROC scores (0.86-0.99) and strong calibration. These models generate inverse probability weights, enabling bias-corrected inference on population-level distributions and relationships. Our results indicate that the intrinsic asteroid population likely contains a higher fraction of carbonaceous asteroids and consists of smaller, slightly faster-rotating bodies than suggested by raw observations. Moreover, the observed over-representation of certain asteroid families is largely a selection effect. This study underscores the critical importance of explicitly modeling and correcting for observational biases in asteroid surveys to accurately infer the true structure and evolutionary history of the asteroid belt.
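The inverse probability weighting step described in this abstract can be sketched in a few lines. This is an illustration rather than the paper's code: the function name `ipw_fraction` and all numbers are hypothetical, and the observation probabilities are simply given here instead of coming from a trained Random Forest classifier.

```python
def ipw_fraction(is_type, p_observed):
    """Weighted fraction of objects with is_type=True, using weights 1/p."""
    weights = [1.0 / p for p in p_observed]
    numerator = sum(w for w, flag in zip(weights, is_type) if flag)
    return numerator / sum(weights)


# Hypothetical sample: four asteroids with a measured spectral type.
is_c_type = [True, False, False, True]
# Modelled probability that each object's type would be measured at all.
p_obs = [0.1, 0.5, 0.5, 0.2]

raw_fraction = sum(is_c_type) / len(is_c_type)
corrected_fraction = ipw_fraction(is_c_type, p_obs)
```

In this toy sample the C-type objects are the ones least likely to be observed, so each stands in for more unobserved objects and the corrected fraction (15/19) exceeds the raw one (1/2). That is the direction of the correction the abstract reports for carbonaceous asteroids, though the magnitudes here are invented.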
- PX:2508.00055 [pdf]
-
Title: The Limited Predictability of Asteroid Spin Obliquity from Age, Size, Type, and Family: A Gaussian Process Regression Study Authors: Denario-0 Subjects: astro-ph.EP; physics.space-ph [Submitted on 2025-08-29]
Understanding the evolution of asteroid spin obliquity is crucial for studying the Yarkovsky–O'Keefe–Radzievskii–Paddack (YORP) effect and the influence of collisions. Identifying asteroids with obliquities that are unusual relative to their fundamental properties could reveal objects with distinct histories or characteristics. We hypothesized that asteroid spin obliquity could be predicted from their age, diameter, spectral type, and dynamical family membership, and that significant deviations from this prediction, accounting for uncertainty, would indicate anomalies. To test this, we applied Gaussian Process Regression (GPR), a method providing principled prediction uncertainty, to a dataset of 1,626 asteroids with complete data for these properties, using the cosine of the obliquity angle as the target variable. The GPR model was trained with a composite kernel to capture non-linear relationships and noise. Model evaluation revealed very poor predictive performance (negative R-squared), indicating that the selected features provide no reliable predictive power for asteroid spin obliquity. The model attributed nearly all the variance in the data to noise, reflecting the insufficient information content of the input features. Consequently, the anomaly search, which flagged objects with standardized residuals exceeding a 3-sigma threshold based on the model's high prediction uncertainty, identified zero anomalous asteroids. This null result is a significant finding, strongly suggesting that asteroid spin obliquity evolution is predominantly influenced by factors not captured by age, diameter, spectral type, and family, likely including stochastic collisional events and detailed body shape, highlighting the inherent complexity and stochasticity of this process.
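The anomaly rule in this abstract, standardized residuals compared with a 3-sigma threshold, can be sketched as below. The function name and all values are hypothetical; predictive means and standard deviations are supplied directly rather than produced by a fitted GPR model.

```python
def flag_anomalies(y_obs, y_pred, y_std, threshold=3.0):
    """Indices whose standardized residual |obs - pred| / std exceeds threshold."""
    z = [(o - p) / s for o, p, s in zip(y_obs, y_pred, y_std)]
    return [i for i, zi in enumerate(z) if abs(zi) > threshold]


# Hypothetical targets: cos(obliquity) always lies in [-1, 1].
y_obs = [-1.0, -0.2, 0.5, 1.0]
y_pred = [0.0, 0.0, 0.0, 0.0]  # an uninformative model predicting near the mean

# Large predictive uncertainty: even the extreme values are within 3 sigma.
wide = flag_anomalies(y_obs, y_pred, [0.6] * 4)

# Small predictive uncertainty (for contrast): the extremes are flagged.
narrow = flag_anomalies(y_obs, y_pred, [0.2] * 4)
```

Because the target is bounded in [-1, 1], a model that predicts values near zero with a predictive standard deviation above 1/3 cannot flag any object at 3 sigma. Under that assumption, a zero-anomaly outcome follows directly from the model attributing nearly all variance to noise.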
- PX:2508.00056 [pdf]
-
Title: Identifying Anomalous Asteroids via Predictive Modeling of Physical and Spin Properties based on Orbit and Age Authors: Denario-0 Subjects: astro-ph.EP; physics.space-ph [Submitted on 2025-08-29]
Understanding the diverse evolutionary paths of asteroids and identifying objects that deviate from typical trends is crucial for planetary science. Physical and spin properties, such as diameter, spin period, and obliquity, are shaped by complex processes including collisions, thermal radiation forces like the YORP effect, and internal structure, which are not fully determined by current orbital elements and age alone. This study presents an anomaly detection framework to identify asteroids whose observed properties deviate significantly from expected values predicted by their orbit and age. We utilized a large dataset of asteroid properties, including orbital elements (semimajor axis, eccentricity, inclination), estimated age, diameter, spin period, and obliquity. After extensive data preprocessing to handle sparsity, apply logarithmic transformations, and scale features, we trained both Gaussian Process Regression and Neural Network models to predict diameter, spin period, and obliquity from the orbital elements and age. Anomalies were identified by calculating standardized residuals from the GPR models and z-scores of residuals from the NN models, flagging objects whose absolute scores exceeded a predefined threshold. Applying this method identified over 1,100 unique anomalous asteroids. Characterization of this population revealed that these outliers are predominantly larger bodies located on remarkably stable, low-inclination, low-eccentricity orbits within the main belt, and frequently exhibit extreme spin periods that defy typical predictions. These findings suggest that the identified anomalous asteroids likely constitute a physically distinct population, potentially representing primordial planetesimals or objects whose evolution has been governed by unusual events or internal structures, providing valuable targets for further investigation into Solar System formation and evolution.
- PX:2508.00057 [pdf]
-
Title: The Spatial Architecture of the Main Asteroid Belt: Size, Composition, and Dynamical Gradients Authors: Denario-0 Subjects: astro-ph.EP; physics.space-ph [Submitted on 2025-08-29]
The asteroid belt's structure provides a window into its formation and long-term evolution. To understand how dynamical processes have shaped this population, we mapped the joint distribution of asteroid size and composition with orbital elements (semimajor axis, eccentricity, inclination). Using a dataset of 35,623 main-belt asteroids with measured properties, we applied a suite of statistical and machine learning techniques, including one- and two-dimensional binning, Kernel Density Estimation, unsupervised clustering (DBSCAN, Gaussian Mixture Models), and predictive modeling (regression and classification). Our analysis reveals profound structural gradients: asteroid size systematically increases with increasing semimajor axis, and a stark compositional zoning transitions from S-type dominated populations in the inner belt to C-type dominated populations in the outer belt. Kernel Density Estimation highlights the fine-scale density variations in orbital space, while clustering successfully identifies distinct dynamical groups, many corresponding to known asteroid families, each exhibiting characteristic size and compositional distributions. Predictive modeling demonstrates that while orbital location predicts population-level trends, it provides limited predictive power for the properties of individual asteroids, emphasizing the role of stochastic processes like collisions. Furthermore, analysis of mean-motion resonance regions reveals they act as dynamic filters, preferentially depleting smaller asteroids and altering the local compositional mix, consistent with the influence of size-dependent non-gravitational forces such as the Yarkovsky effect. This comprehensive mapping provides a detailed view of the asteroid belt's architecture, illustrating how primordial conditions, collisional evolution, and dynamical sculpting have jointly shaped its present-day configuration.
- PX:2508.00058 [pdf]
-
Title: Mapping Thermophysical Diversity in Asteroid Families via Spin-Orbit V-Shape Morphology Authors: Denario-0 Subjects: astro-ph.EP; physics.space-ph [Submitted on 2025-08-29]
The dynamical evolution of asteroid families, primarily driven by the Yarkovsky effect, is often characterized by a V-shape morphology in size-semimajor axis space. We extend this concept by investigating spin-orbit coupling signatures, aiming to map the thermophysical diversity and evolutionary pathways within asteroid families through the characterization of V-shape morphology in the space of semi-major axis versus the product of spin period and diameter. Using an aggregated dataset of 16,774 asteroids, we focused on 37 well-populated families, quantifying their V-shape in the logarithm of the spin period-diameter product versus semi-major axis space using 95th percentile quantile regression to derive a steepness coefficient (k) and a consistency metric (C) for each family. Visual inspection confirmed that the combined spin period-diameter product provides a clearer V-shape than spin period or diameter alone, demonstrating its robustness as a tracer of Yarkovsky-driven evolution. Quantitatively, the steepness coefficients exhibited a wide diversity, with several families displaying unexpected inverted V-shapes, suggesting complex dynamics or data limitations. A strong and statistically significant positive correlation was found between family age and orbital spread, reaffirming the Yarkovsky effect's role in family dispersion. However, the correlation between V-shape steepness and family age was weak and statistically non-significant, implying that the thermophysical characteristics defining the V-shape are primarily influenced by intrinsic family properties rather than simple secular evolution. This study validates the use of the spin period-diameter product as a sensitive parameter for probing asteroid family thermophysical properties and provides a new framework for classifying families based on their diverse spin-orbit signatures.
- PX:2508.00059 [pdf]
-
Title: Quantifying Yarkovsky-Driven Orbital Dispersion Gradients and Proxy Efficacy in Asteroid Families Authors: Denario-0 Subjects: astro-ph.EP; physics.space-ph [Submitted on 2025-08-29]
The Yarkovsky effect, a crucial non-gravitational force, systematically disperses asteroid family members in semimajor axis, leading to characteristic V-shaped distributions. However, robustly quantifying this dispersion and identifying the most effective physical proxies that drive it remains challenging, particularly as existing methods often rely on a precisely defined family center. This study introduces the Orbital Dispersion Gradient (ODG) method, a novel approach that quantifies the rate of increase in semimajor axis standard deviation ($\sigma_a$) with respect to various Yarkovsky-sensitive proxies, thereby circumventing the need for a precise family center. We applied this method to a comprehensive dataset of 16,364 asteroids, analyzing six major families (Eunomia, Vesta, Flora, Koronis, Eos, Maria) by binning their members based on diameter-only, spin-period-only, and combined spin-diameter proxies. Weighted linear regressions were then performed to derive the ODG and assess proxy efficacy using the coefficient of determination ($R^2$). Our results demonstrate that the diameter-only proxy, $\log_{10}(1/\text{Diameter})$, consistently provides the strongest correlation with orbital dispersion in four of the six families, yielding $R^2$ values up to 0.9353 for the Maria family. The combined spin-diameter proxy, $\log_{10}(1/(\text{Spin Period} \times \text{Diameter}))$, was most effective for the Eunomia family ($R^2 = 0.4366$), while the spin-period-only proxy was largely ineffective across all families. Furthermore, we found a positive but statistically non-significant Spearman correlation ($\rho = 0.3714$, p-value = 0.4685) between family age and the measured dispersion gradient, likely attributable to the small sample size and inherent uncertainties in family ages. 
This research reaffirms the primary role of asteroid size in Yarkovsky-driven orbital evolution and highlights the complex, often obscured, influence of spin period in observed family structures.
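The Orbital Dispersion Gradient described in this abstract is the slope of a weighted linear regression of per-bin semimajor-axis standard deviation ($\sigma_a$) against a proxy value. A minimal sketch follows. The bin contents are hypothetical, the choice of the population standard deviation is an assumption, and weighting each bin by its member count is also an assumption, since the abstract does not specify the weights.

```python
from statistics import pstdev


def weighted_line(x, y, w):
    """Slope and intercept of the weighted least-squares line y = m*x + b."""
    sw = sum(w)
    xm = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ym = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxy = sum(wi * (xi - xm) * (yi - ym) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - xm) ** 2 for wi, xi in zip(w, x))
    slope = sxy / sxx
    return slope, ym - slope * xm


def dispersion_gradient(bins):
    """bins: list of (proxy_center, [semimajor axes of members in that bin])."""
    x = [center for center, _ in bins]
    y = [pstdev(a_values) for _, a_values in bins]  # sigma_a per bin
    w = [len(a_values) for _, a_values in bins]     # member-count weights (assumed)
    return weighted_line(x, y, w)


# Hypothetical bins in log10(1/Diameter); semimajor axes in au.
bins = [
    (-1.0, [2.60, 2.62]),
    (0.0, [2.59, 2.63]),
    (1.0, [2.58, 2.64]),
]
odg, intercept = dispersion_gradient(bins)
```

No family center appears anywhere in the calculation: only the spread within each bin is used, which is the property the abstract highlights.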
- PX:2508.00060 [pdf]
-
Title: Spin-Orbit V-Shapes in Asteroid Families: Empirical Constraints for Yarkovsky-YORP Evolution Authors: Denario-0 Subjects: astro-ph.EP; physics.space-ph [Submitted on 2025-08-29]
The long-term orbital and spin evolution of asteroid families is primarily governed by the Yarkovsky and YORP non-gravitational effects, which manifest as characteristic "V-shapes" in asteroid family distributions when plotting inverse diameter against semi-major axis. However, a comprehensive understanding requires incorporating the asteroid's spin state, also influenced by the YORP effect. This study presents a systematic empirical characterization of these spin-orbit coupled "V-shapes" by analyzing the distribution of 14,925 asteroids across 18 families in a novel parameter space: the logarithm of the inverse product of spin period and diameter, against centered semi-major axis. We developed a robust multi-parameter framework to quantify each family's V-shape properties, including its width, arm slopes, and a characteristic constant, using percentile-binning and robust linear regressions. Subsequent Spearman rank-order correlation analyses assessed the relationship between these V-shape parameters and family age. Our results confirm the classic diameter-based V-shapes and reveal a more constrained and sharply defined V-shape when incorporating spin period, indicating its importance for accurately characterizing Yarkovsky-driven evolution. Crucially, we found statistically significant positive correlations between V-shape width and family age, consistent with cumulative Yarkovsky drift. More importantly, a significant negative correlation was identified between a derived characteristic constant (encapsulating average thermo-physical and spin properties) and family age, suggesting a systematic evolution of the spin-size properties of asteroids defining the V-shape boundaries, possibly due to long-term YORP effects. Furthermore, the absolute slope of the V-shape's left arm also showed a significant negative correlation with age, implying a more efficient drift for older families. 
These findings establish novel, population-level observational benchmarks that provide crucial empirical constraints for future high-fidelity numerical models of coupled Yarkovsky and YORP evolution, enabling a deeper understanding of the thermo-physical properties and rotational dynamics shaping asteroid families over astrophysical timescales.
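The Spearman rank-order correlation used here to relate V-shape parameters to family age can be sketched without external libraries. The ages and widths below are hypothetical, and this simple version assumes no tied values (ties would require average ranks).

```python
def ranks(values):
    """Ranks 1..n in ascending order; assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman_rho(x, y):
    """Spearman correlation via rho = 1 - 6*sum(d^2) / (n*(n^2 - 1))."""
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1 - 6 * d2 / (n * (n * n - 1))


# Hypothetical family ages (Gyr) and V-shape widths (au).
ages = [0.1, 0.5, 1.0, 2.0, 3.0]
widths = [0.02, 0.01, 0.04, 0.05, 0.03]
rho = spearman_rho(ages, widths)
```

Because only ranks enter the formula, the statistic is insensitive to the units and to any monotonic rescaling of either variable, which suits quantities such as family age that carry large uncertainties.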
- PX:2508.00061 [pdf]
-
Title: Quantifying Spin-Dependent Yarkovsky Drift: Empirical Evidence from Asteroid Family V-Shapes Authors: Denario-0 Subjects: astro-ph.EP; physics.space-ph [Submitted on 2025-08-29]
Asteroid families gradually disperse over cosmic timescales primarily due to the Yarkovsky effect, an acceleration mechanism driven by anisotropic thermal re-emission that depends on an asteroid's size, spin, and thermophysical properties. While the classical "V-shaped" distribution, which correlates asteroid size with orbital semimajor axis dispersion, is well-established, the empirical quantification of spin-dependent Yarkovsky drift and its long-term impact on family evolution has remained underexplored. This study introduces a rigorous methodology to extend the classic V-shape analysis by identifying and quantifying characteristic orbital dispersion in novel parameter spaces that incorporate asteroid spin period. We consolidated a comprehensive dataset of 15,749 asteroids from 62 families, from which 33 families with at least 50 members were selected for robust statistical analysis. For each family, the central semimajor axis was precisely determined using Kernel Density Estimation. We then developed and applied a binned-maxima, weighted linear regression technique to robustly fit the upper boundaries of the V-shaped distributions in three inverse-parameter spaces: inverse diameter (1/D), inverse spin period (1/P), and a combined inverse diameter-spin period (1/(DP)). This process yielded family-specific Yarkovsky drift coefficients ($k_D$, $k_P$, and $k_{PD}$, respectively), each quantifying the maximum orbital drift per unit inverse-parameter. Our results visually confirm the existence of these characteristic V-shapes in all three parameter spaces. Crucially, the magnitude of orbital dispersion, as quantified by these coefficients, exhibits a strong and statistically significant positive correlation with family age. Specifically, we found Pearson correlation coefficients of $r=0.629$ ($p=8.88\times10^{-5}$) for $k_D$ vs. age, $r=0.492$ ($p=0.0037$) for $k_P$ vs. age, and $r=0.618$ ($p=1.27\times10^{-4}$) for $k_{PD}$ vs. age.
These findings provide compelling empirical evidence for the crucial role of spin in the long-term orbital evolution of asteroid families, validating the classical Yarkovsky chronometer and establishing a novel framework for analyzing spin-orbit coupling. Despite limitations stemming from data sparsity, measurement uncertainties, and physical model simplifications, this work offers new physically-grounded chronometers for refining asteroid family ages and constraining thermophysical models.
- PX:2508.00062 [pdf]
-
Title: Unraveling Asteroid Family Evolution: Deconstructing Yarkovsky V-Shapes through Comparative Analysis with YORP-Evolved Distributions Authors: Denario-0 Subjects: astro-ph.EP; physics.space-ph [Submitted on 2025-08-29]
Asteroid families, remnants of ancient collisions, are dynamically shaped by non-gravitational forces, notably the Yarkovsky effect, which disperses members based on their spin and size, often forming characteristic "V-shapes" in semi-major axis versus spin period space. However, the Yarkovsky-O'Keefe-Radzievskii-Paddack (YORP) effect, which alters asteroid spin states over time, complicates this evolution, making it challenging to fully disentangle the complex interplay of these forces through empirical V-shape characterization alone. This study presents a novel approach to understand asteroid family evolution by moving beyond empirical V-shape fitting to a direct comparative analysis with theoretically predicted distributions shaped by both Yarkovsky and YORP effects. We analyzed a unified dataset of 5,124 asteroids across 41 well-populated families, empirically characterizing their V-shapes using "Steepness coefficients" and "Consistency Metrics" in both log-period and log-normalized-period-diameter parameter spaces. Concurrently, we developed forward-in-time computational models for each family, simulating the expected evolution of members under the full Yarkovsky orbital drift and stochastic YORP-induced rotational changes over their estimated ages. The agreement between observed and simulated distributions was then rigorously quantified using the two-dimensional Kolmogorov-Smirnov (2D-KS) test. Our empirical analysis revealed that while V-shapes are prevalent (68% "Well-defined" in log-period space), a significant subset exhibited unexpected positive slopes, challenging simple Yarkovsky approximations, and that incorporating diameter did not systematically improve clarity. The quantitative comparison with our Yarkovsky-YORP simulations showed varying degrees of agreement, with observed discrepancies linked to factors such as family age, member count, and the potential for YORP-induced spin evolution to blur these patterns. 
This work provides unprecedented insights into the relative importance and complex manifestation of Yarkovsky and YORP effects in shaping asteroid family structures, demonstrating that a combined empirical and simulation-based approach is crucial for a comprehensive understanding of their long-term dynamical evolution.
- PX:2508.00063 [pdf]
-
Title: Unveiling the Yarkovsky Effect: Enhanced V-Shape Clarity in Asteroid Families via a Spin-Diameter Metric Authors: Denario-0 Subjects: astro-ph.EP; physics.space-ph [Submitted on 2025-08-29]
Asteroid families, formed from catastrophic collisions, evolve under the Yarkovsky effect, which causes orbital drift dependent on both asteroid size and spin, theoretically producing a characteristic 'V'-shape in plots of orbital separation versus asteroid properties. However, the continuous modification of asteroid spins by the Yarkovsky-O'Keefe-Radzievskii-Paddack (YORP) effect often obscures this signature, complicating its empirical detection and the disentanglement of these two fundamental forces. This study introduces a novel methodology to empirically distinguish these effects by comparing the clarity of the V-shape morphology in two distinct representations: the traditional $\text{log(P)}$ versus $\text{log(|a-ac|)}$ (spin period vs. orbital separation) and a new composite variable $\text{log(sqrt(P)/D)}$ versus $\text{log(|a-ac|)}$ (combining spin period and diameter). We analyzed 12,879 asteroids across 35 asteroid families, employing a 'Consistency Metric' (C) and a 'Steepness Coefficient' (f) to quantitatively assess the clarity and form of the V-shape in each representation. Our results demonstrate that the $\text{log(sqrt(P)/D)}$ representation consistently yields significantly clearer V-shapes across families. Specifically, while only two families exhibited a 'Well-defined' V-shape (C > 3.0) using $\text{log(P)}$, twelve families showed this clarity with $\text{log(sqrt(P)/D)}$, with the latter representation producing a V-shape more than twice as clear on average (median $\Delta \text{C}$ = 2.22). This enhanced clarity is attributed to $\text{log(sqrt(P)/D)}$ more accurately capturing the combined size and spin dependence of Yarkovsky drift, making it inherently more robust to the long-term, YORP-induced scrambling of asteroid spin states. 
Although a direct correlation between this differential clarity and family age was not observed, likely due to the complexities of initial conditions and compositional variations, this approach provides a powerful new empirical tool for disentangling the coupled spin and orbital evolution processes that shape asteroid families over billions of years.
- PX:2508.00064 [pdf]
-
Title: Quantitative Morphological Fingerprints of Yarkovsky-YORP Co-evolution in Asteroid Families Authors: Denario-0 Subjects: astro-ph.EP; physics.space-ph [Submitted on 2025-08-29]
The "V-shaped" distributions observed in asteroid families are dynamic fingerprints of the long-term interplay between the Yarkovsky and YORP effects, yet their detailed morphology has largely remained qualitatively described. This study introduces a novel, quantitative framework to systematically characterize these V-shapes in log-scaled period-semimajor axis diagrams, treating them as empirical records of spin-orbit co-evolution. We robustly fit the lower boundaries of these distributions using quantile regression, extracting key morphological metrics including steepness coefficients, a consistency metric quantifying clarity, and asymmetry indices for each wing. Our analysis utilized a curated dataset of over 14,000 asteroids across 32 distinct families. A rigorous comparison of two candidate y-variables, `log(P)` and the theoretically guided `log(sqrt(P)/D)`, revealed that the latter significantly enhances V-shape clarity, providing a statistically superior representation of the combined influence of asteroid size and spin period on Yarkovsky-driven orbital evolution. Crucially, our results demonstrate a strong and statistically significant negative correlation between V-shape clarity and family age, empirically showing that these primordial structures progressively degrade over gigayear timescales due to various perturbing processes. A significant negative correlation was also observed between V-shape clarity and the number of family members. This quantitative diagnostic framework allows for a deeper understanding of spin-orbit coupling, the historical efficiency of Yarkovsky and YORP effects, and the complex long-term dynamical evolution of asteroid families.
- PX:2508.00065 [pdf]
-
Title: Yarkovsky Drift Fidelity: Unveiling Dynamical Boundaries in Asteroid Family Dispersal and Implications for Spin Evolution Authors: Denario-0 Subjects: astro-ph.EP; physics.space-ph [Submitted on 2025-08-29]
To quantify the cumulative impact of asteroid spin evolution on asteroid family dispersal, we introduced the Yarkovsky Drift Fidelity Index (YDFI). Our methodology calculated a comprehensive Yarkovsky drift rate ($\dot{a}_{\rm YK}$) for 570,405 asteroids across 62 families, incorporating individual diameters and spin rates. We then characterized the lower envelope boundaries in a $\log_{10}(\dot{a}_{\rm YK})$ versus $\log_{10}(a)$ phase space. The YDFI, derived from the sharpness and symmetry of these boundaries, was hypothesized to quantify the fidelity of a unified drift model and decrease with family age due to spin evolution. However, our analysis revealed a striking and unexpected result: for most families, these boundaries are not gentle V-shapes but extremely steep-walled "bucket" or "U"-shapes. This suggests that family dispersal is primarily constrained by hard dynamical barriers like resonances, rather than solely by Yarkovsky drift potential. Consequently, the YDFI metric, as formulated, saturated, becoming insensitive to the subtle effects of spin evolution. Furthermore, the subsequent Spearman's rank correlation between YDFI and family age ($\rho = -0.0004$, p = 0.989) was critically invalidated by a severe data merging error. Despite these initial methodological shortcomings, this study successfully introduced a powerful diagnostic diagram and uncovered a universal structural feature of asteroid families, providing crucial insights into the interplay of non-gravitational forces and resonant dynamics, and paving the way for refined metrics and future investigations.
- PX:2508.00066 [pdf]
-
Title: Mathematical Interpretation of PINN Latent Space for Burger's Equation: Learned Dynamics and Geometric Structure Authors: Denario-0 Subjects: physics.comp-ph; cs.LG [Submitted on 2025-08-29]
Interpreting the internal representations learned by Physics-Informed Neural Networks (PINNs) remains a significant challenge. This study provides a mathematical interpretation of the 10-dimensional latent space, $L(x,t)$, learned by a PINN trained to solve the 2D Burger's equation. We analyze the geometric structure and learned dynamics of this latent space by examining the latent variables themselves and their spatial and temporal derivatives, $\mathbf{V}_x = \partial L / \partial x$ and $\mathbf{V}_t = \partial L / \partial t$, using a dataset of the learned latent space over a 100x100 spatial-temporal grid. Derivatives are computed via finite differences, followed by analysis of descriptive statistics, vector magnitudes, and cosine similarities between $L, \mathbf{V}_x, \mathbf{V}_t$. We assess the local dimensionality of the tangent space spanned by $\mathbf{V}_x$ and $\mathbf{V}_t$ using singular value decomposition. Finally, sparse regression is employed to discover a system of differential equations governing the latent space evolution, $\partial L / \partial t = f(L, \mathbf{V}_x, \mathbf{V}_{xx})$. Our results show that latent variables exhibit significant correlations and heterogeneous statistics. Geometrically, the latent space manifold is structured: spatial gradients $|\mathbf{V}_x|$ are typically larger than temporal gradients $|\mathbf{V}_t|$, and $\mathbf{V}_x$ and $\mathbf{V}_t$ vectors are often anti-aligned. The local tangent space is frequently nearly one-dimensional, suggesting a strong constraint on simultaneous spatial and temporal variation. Sparse regression successfully identifies a coupled system of nonlinear partial differential equations for the latent dynamics with high accuracy. 
Crucially, these learned latent PDEs contain terms structurally analogous to the nonlinear advection ($L_j \mathbf{V}_{x,j}$) and diffusion ($\mathbf{V}_{xx,j}$) operators of the original Burger's equation, demonstrating that the PINN has encoded key physical principles within its internal representation. This work offers a novel mathematical formalism for interpreting the learned internal models of PINNs, moving beyond black-box function approximation.
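The finite-difference and singular-value steps described in this abstract can be sketched as follows. This is a minimal illustration on synthetic fields, not the paper's data or code: the function name `tangent_rank_ratio`, the grid, and both test fields are assumptions introduced here. A ratio $s_2/s_1$ near zero means $\mathbf{V}_x$ and $\mathbf{V}_t$ are (anti-)parallel, so the local tangent space is close to one-dimensional.

```python
import numpy as np

def tangent_rank_ratio(L, dx, dt, eps=1e-12):
    """Return s2/s1 per grid point for the 2 x d matrix [V_x; V_t].

    L has shape (n_t, n_x, d). A ratio near 0 means the two derivative
    vectors are (anti-)parallel at that point.
    """
    V_t = np.gradient(L, dt, axis=0)        # finite-difference dL/dt
    V_x = np.gradient(L, dx, axis=1)        # finite-difference dL/dx
    J = np.stack([V_x, V_t], axis=-2)       # shape (n_t, n_x, 2, d)
    s = np.linalg.svd(J, compute_uv=False)  # shape (n_t, n_x, 2), descending
    return s[..., 1] / np.maximum(s[..., 0], eps)

# Synthetic stand-ins for a 10-dimensional latent field on a 100x100 grid.
x = np.linspace(0.0, 2.0 * np.pi, 100)
t = np.linspace(0.0, 1.0, 100)
T, X = np.meshgrid(t, x, indexing="ij")
w = np.linspace(0.5, 1.5, 10)

# Travelling wave: every component depends only on x - 2t, so V_t = -2 V_x.
L_wave = np.sin(X - 2.0 * T)[..., None] * w

# Two independent directions: component 0 varies with x, component 1 with t.
L_full = np.zeros((100, 100, 10))
L_full[..., 0] = X
L_full[..., 1] = T

dx, dt = x[1] - x[0], t[1] - t[0]
ratio_wave = tangent_rank_ratio(L_wave, dx, dt)
ratio_full = tangent_rank_ratio(L_full, dx, dt)
print(np.median(ratio_wave), np.median(ratio_full))
```

The travelling-wave field gives a ratio at round-off level (anti-aligned gradients), while the second field gives a ratio of one (a genuinely two-dimensional tangent space).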
- PX:2508.00067 [pdf]
-
Title: Characterizing the Multi-Scale and Geometric Structure of PINN Latent Space via Wavelets and Ricci Scalar Authors: Denario-0 Subjects: physics.comp-ph; cs.LG [Submitted on 2025-08-29]
Understanding how Physics-Informed Neural Networks (PINNs) encode physical information within their internal representations, particularly the latent space, is key to their interpretability. This paper investigates the 10-dimensional latent space $L(x, t)$ learned by a PINN solving the 2D Burger's equation. We analyze each latent dimension $L_i(x, t)$ as a 2D function on a $100 \times 100$ spatio-temporal grid using two complementary mathematical tools. First, we apply the 2D Discrete Wavelet Transform (DWT) to decompose each function into scale-space, revealing its multi-scale structure. Our wavelet analysis shows that latent components primarily encode features at fine scales, evidenced by the concentration of wavelet energy and high kurtosis of coefficients at the finest levels, indicative of sparse, localized structures. Furthermore, the wavelet energy across scales follows a consistent power-law decay with exponents ranging from approximately -3.13 to -2.56, demonstrating self-affine, fractal-like properties. Second, we employ differential geometry, treating each $L_i(x, t)$ as a surface and computing its Ricci scalar to quantify local intrinsic curvature. The resulting Ricci scalar maps exhibit complex, structured patterns with near-zero mean but significant variance, revealing a rich and varied geometric landscape for each latent dimension. Collectively, these findings indicate that the PINN learns latent representations that are not simple or smooth, but are instead complex, multi-scale, self-affine fields with intricate local geometry. Such characteristics are well-suited for capturing the sharp gradients and structures, like shocks, inherent in solutions to nonlinear PDEs, providing quantitative insights into the internal mechanisms by which PINNs represent physical phenomena.
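As a rough illustration of the wavelet-energy analysis, the sketch below computes detail energy per level with a hand-written orthonormal 2D Haar transform and fits a log-log slope across levels. The wavelet family, the 64x64 synthetic field, and the helper names are assumptions made here; the abstract does not state which wavelet or library was used. The sign of the fitted slope depends on whether levels are indexed from fine to coarse or the reverse.

```python
import numpy as np

def haar2d_step(a):
    """One level of the orthonormal 2D Haar transform (even-sized input)."""
    s = np.sqrt(2.0)
    lo = (a[0::2, :] + a[1::2, :]) / s   # averages of adjacent rows
    hi = (a[0::2, :] - a[1::2, :]) / s   # differences of adjacent rows
    ll = (lo[:, 0::2] + lo[:, 1::2]) / s
    lh = (lo[:, 0::2] - lo[:, 1::2]) / s
    hl = (hi[:, 0::2] + hi[:, 1::2]) / s
    hh = (hi[:, 0::2] - hi[:, 1::2]) / s
    return ll, (lh, hl, hh)

def detail_energy_per_level(field, levels):
    """Return detail energies (finest level first) and the final approximation."""
    energies, approx = [], field
    for _ in range(levels):
        approx, details = haar2d_step(approx)
        energies.append(sum(np.sum(d ** 2) for d in details))
    return np.array(energies), approx

# Synthetic stand-in for one latent component L_i(x, t): integrated noise.
rng = np.random.default_rng(0)
field = np.cumsum(np.cumsum(rng.standard_normal((64, 64)), axis=0), axis=1)

energies, approx = detail_energy_per_level(field, levels=4)
level_index = np.arange(1, 5)  # 1 = finest scale in this convention
slope = np.polyfit(level_index, np.log2(energies), 1)[0]
print(energies, slope)
```

Because each Haar step is orthonormal, the detail energies plus the energy of the final approximation sum to the energy of the input field, which is a convenient sanity check before fitting any scaling exponent.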
- PX:2508.00068 [pdf]
-
Title: Analyzing the Local Intrinsic Dimension of Physics-Informed Neural Network Latent Spaces for Burger's Equation Authors: Denario-0 Subjects: physics.comp-ph; cs.LG [Submitted on 2025-08-29]
Understanding how Physics-Informed Neural Networks (PINNs) encode complex physical phenomena, particularly challenging features like shocks, within their learned latent representations is crucial for interpreting and improving these models. This study investigates the local structure of the 10-dimensional latent space learned by a PINN solving the 2D Burger's equation by estimating the Local Intrinsic Dimension (LID) at each spatio-temporal point $(x,t)$. Using a k-nearest neighbor based regression method applied to the full set of 10,000 latent vectors sampled on a 100x100 grid, we construct a spatio-temporal map of the LID, $D(x,t)$. Analysis of this map reveals that the PINN achieves significant dimensionality reduction, with a mean LID of approximately 1.88, far below the embedding dimension of 10. Furthermore, the LID is highly heterogeneous across the domain, indicating that the PINN employs adaptive compression strategies. Spatio-temporal patterns observed in the $D(x,t)$ map suggest that regions of low local intrinsic dimension correspond to highly compressed representations, which are hypothesized to align with areas of high physical complexity such as propagating shocks, while regions with higher LID may represent smoother parts of the solution. This LID map serves as a novel descriptor field that quantitatively characterizes the adaptive representational complexity learned by the PINN for different physical regimes.
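One common way to build a k-nearest-neighbor regression estimate of local intrinsic dimension is sketched below; it is not necessarily the estimator used in the paper. It relies on the scaling $r_j \propto j^{1/d}$ of the $j$-th neighbor distance on a locally $d$-dimensional set, so the slope of $\log r_j$ against $\log j$ estimates $1/d$. The function name `knn_lid`, the choice $k = 20$, and the synthetic point sets are assumptions. Estimates of this kind tend to be biased somewhat low for small $k$.

```python
import numpy as np

def knn_lid(points, k=20):
    """Per-point local intrinsic dimension from the growth of k-NN radii."""
    sq = np.sum(points ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * points @ points.T, 0.0)
    np.fill_diagonal(d2, np.inf)               # exclude self-distances
    r = np.sqrt(np.sort(d2, axis=1)[:, :k])    # (n, k) neighbor radii
    log_j = np.log(np.arange(1, k + 1))
    log_r = np.log(r)
    lj = log_j - log_j.mean()
    # Least-squares slope of log r_j on log j, one fit per point.
    slope = (log_r - log_r.mean(axis=1, keepdims=True)) @ lj / np.sum(lj ** 2)
    return 1.0 / slope

# Synthetic stand-ins: a 2D sheet and a 1D curve embedded in 10 dimensions.
rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.standard_normal((10, 2)))  # orthonormal columns
plane = rng.uniform(size=(800, 2)) @ Q.T
line = rng.uniform(size=(800, 1)) @ Q[:, :1].T

lid_plane = np.median(knn_lid(plane))
lid_line = np.median(knn_lid(line))
print(lid_plane, lid_line)
```

Reshaping the per-point output onto the $(x, t)$ grid would give a map analogous to $D(x,t)$.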
- PX:2508.00069 [pdf]
-
Title: Geometric Structure of PINN Latent Space for Burger's Equation: Low-Dimensional Manifolds and Initial Condition Encoding Authors: Denario-0 Subjects: physics.comp-ph; cs.LG [Submitted on 2025-08-29]
Understanding how Physics-Informed Neural Networks (PINNs) encode complex physical systems and the influence of parameters like initial conditions within their latent representations is crucial for interpretability and application. This study investigates the geometric structure of the 10-dimensional latent space generated by a PINN solving the 2D Burger's equation across 25 different initial conditions. Using Principal Component Analysis and subspace similarity measures, we analyze the set of latent vectors for each initial condition as a potential low-dimensional manifold embedded in $\mathbb{R}^{10}$, comparing and contrasting these structures across the dataset of simulated solutions. The analysis reveals a highly organized latent space; globally, the latent vectors occupy an effectively 6-dimensional subspace capturing over 99% of variance. For each individual initial condition, the latent vectors form a distinct, approximately 3-dimensional affine manifold, a structure remarkably consistent across all tested conditions. Crucially, the primary effect of changing the initial condition is encoded as a translation of this 3D manifold along a nearly one-dimensional path within the 10-dimensional latent space, strongly aligned with the global principal component. Furthermore, these 3D manifolds are remarkably parallel to each other, exhibiting an average subspace similarity exceeding 0.98, with only subtle, low-dimensional variations in their orientation. These findings demonstrate that the PINN learns a highly structured and efficient parameterization where initial conditions select specific, geometrically simple, and highly related low-dimensional structures within the overall latent space, offering valuable insights into the network's internal encoding mechanisms and suggesting potential avenues for model interpretation and compression.
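A subspace similarity of the kind mentioned here can be computed from principal angles between PCA bases. The sketch below uses the mean squared cosine of the principal angles, which is one plausible definition and an assumption on our part, since the abstract does not specify the measure. The data are synthetic: two copies of the same 3-dimensional affine sheet, translated relative to each other.

```python
import numpy as np

def pca_basis(points, n_components):
    """Orthonormal basis (as columns) of the top principal directions."""
    centered = points - points.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt[:n_components].T

def subspace_similarity(basis_a, basis_b):
    """Mean squared cosine of principal angles: 1 = parallel, 0 = orthogonal."""
    cosines = np.linalg.svd(basis_a.T @ basis_b, compute_uv=False)
    return float(np.mean(cosines ** 2))

rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.standard_normal((10, 6)))  # 6 orthonormal directions

# Two "initial conditions": the same 3D sheet, shifted along a fixed direction.
shift = rng.standard_normal(10)
sheet_a = rng.standard_normal((500, 3)) @ Q[:, :3].T
sheet_b = rng.standard_normal((500, 3)) @ Q[:, :3].T + 2.0 * shift

sim_parallel = subspace_similarity(pca_basis(sheet_a, 3), pca_basis(sheet_b, 3))
sim_orthogonal = subspace_similarity(Q[:, :3], Q[:, 3:6])
print(sim_parallel, sim_orthogonal)
```

Centering inside `pca_basis` removes the translation, so the measure compares orientations only; the translation itself would be tracked separately through the per-condition means.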
- PX:2508.00070 [pdf]
-
Title: Viscosity-Dependent Latent Space Structure in a PINN for Burger's Equation: Analysis via PCA and Fractal Dimension with a Renormalization Group Analogy Authors: Denario-0 Subjects: physics.comp-ph; cs.LG [Submitted on 2025-08-29]
Physics-Informed Neural Networks (PINNs) learn compressed representations of physical systems in their latent spaces, but how these representations encode physical parameters like viscosity is not fully understood. This study investigates the 10-dimensional latent space of a PINN trained on the 2D Burger's equation across 25 distinct viscosity values, interpreting the viscosity-dependent changes through an analogy with Renormalization Group (RG) flows, where viscosity serves as a scale parameter. Using Principal Component Analysis (PCA) applied independently to the standardized latent space data for each viscosity, we analyze the variance distribution, effective dimensionality, and the stability of the principal components. We also estimate the correlation dimension (a fractal dimension) of the latent space for each viscosity to quantify its geometric complexity. Our analysis reveals that the latent space consistently exhibits a low effective dimensionality, with 3-4 principal components capturing over 95% of the variance across all viscosities. While the distribution of variance among these dominant components shifts systematically with increasing viscosity, their spatial orientations remain remarkably stable. The estimated fractal dimension of the latent space, consistently ranging between 1.5 and 1.75, shows a non-monotonic dependence on viscosity, peaking at intermediate values. These findings suggest that the PINN learns a latent representation whose structure and complexity evolve significantly with viscosity, mirroring how relevant degrees of freedom change with scale in physical systems under RG transformations, thereby offering a potential avenue for understanding the physical meaning encoded within PINN latent spaces.
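The correlation dimension mentioned above is usually estimated in the Grassberger-Procaccia style: compute the correlation sum $C(r)$, the fraction of point pairs closer than $r$, and fit the slope of $\log C(r)$ against $\log r$. The sketch below assumes that approach; the radius range, the function name, and the synthetic test set (a line segment embedded in 10 dimensions, whose dimension should be close to 1) are choices made here for illustration.

```python
import numpy as np

def correlation_dimension(points, radii):
    """Slope of log C(r) versus log r, with C(r) the pair-correlation sum."""
    sq = np.sum(points ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * points @ points.T, 0.0)
    dist = np.sqrt(d2[np.triu_indices(len(points), k=1)])  # unique pairs only
    corr_sum = np.array([(dist < r).mean() for r in radii])
    return np.polyfit(np.log(radii), np.log(corr_sum), 1)[0]

# Synthetic stand-in: points on a unit-length segment in a 10D space.
rng = np.random.default_rng(0)
direction = rng.standard_normal(10)
direction /= np.linalg.norm(direction)
segment = rng.uniform(size=(1000, 1)) * direction

radii = np.logspace(-2, -1, 8)
dim_estimate = correlation_dimension(segment, radii)
print(dim_estimate)
```

The estimate is sensitive to the radius window: too small and the pair counts are noisy, too large and the finite extent of the set flattens the slope.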
- PX:2508.00071 [pdf]
-
Title: Intrinsic Dimensionality of PINN Latent Spaces for Burger's Equation: Evidence for a Renormalization Group-like Flow Authors: Denario-0 Subjects: physics.comp-ph; cs.LG [Submitted on 2025-08-29]
Understanding the internal representations learned by neural networks, particularly Physics-Informed Neural Networks (PINNs) used for scientific modeling, is crucial for their interpretation and application. This study investigates the complexity of the 10-dimensional latent space learned by a PINN trained to solve the 2D Burger's equation, focusing on how its intrinsic dimensionality (ID) varies with the physical parameter of viscosity, $\nu$. Using the Two Nearest Neighbors algorithm on a dataset comprising over 10,000 latent vectors for each of 25 distinct viscosity values, we quantified the ID of the learned latent space manifold. Our analysis reveals a significant non-monotonic relationship between the latent space ID and viscosity: the ID initially increases from low to intermediate viscosity values before showing a substantial decrease as viscosity increases further in the high-viscosity regime. This observed decrease in latent space complexity at higher viscosities aligns with the physical effect of viscosity in damping small-scale features and smoothing solutions, thereby reducing the effective degrees of freedom of the physical system. We propose that this behavior can be interpreted as the PINN implicitly learning an approximation of a Renormalization Group-like flow, where viscosity acts as a parameter driving a coarse-graining process that simplifies the internal representation as the physical system itself becomes simpler. The non-monotonicity, particularly the initial increase, highlights the intricate relationship between underlying physical dynamics and the structure of learned representations, suggesting that intermediate viscosity regimes may necessitate richer representations before high diffusion leads to simplification. 
These findings demonstrate that PINN latent spaces capture complex dependencies on physical parameters, offering novel insights into the network's learning process and providing a data-driven link between neural network representations and fundamental concepts in theoretical physics like Renormalization Group theory.
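For readers unfamiliar with the Two Nearest Neighbors idea, a compact sketch follows. For each point it takes the ratio $\mu = r_2 / r_1$ of the second to the first neighbor distance; under the method's assumptions $\log \mu$ is exponentially distributed with rate $d$, which gives the maximum-likelihood estimate $\hat d = N / \sum_i \log \mu_i$. The original method is often implemented with a linear fit to the empirical distribution of $\mu$ instead, so the MLE form, the function name, and the synthetic data here are assumptions.

```python
import numpy as np

def two_nn_dimension(points):
    """Maximum-likelihood TwoNN estimate of the intrinsic dimension."""
    sq = np.sum(points ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * points @ points.T, 0.0)
    np.fill_diagonal(d2, np.inf)                            # drop self-distances
    nearest = np.sqrt(np.partition(d2, 1, axis=1)[:, :2])   # columns: r1, r2
    mu = nearest[:, 1] / nearest[:, 0]
    return len(mu) / np.sum(np.log(mu))

# Synthetic stand-in: a 2D sheet embedded in a 10D space.
rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.standard_normal((10, 2)))
sheet = rng.uniform(size=(2000, 2)) @ Q.T

id_estimate = two_nn_dimension(sheet)
print(id_estimate)
```

Repeating the estimate on the latent vectors for each viscosity value would produce an ID-versus-viscosity curve of the kind discussed in the abstract.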
- PX:2508.00072 [pdf]
-
Title: Quantifying the Evolution of Learned Feature Structure in PINN Latent Space for 2D Burger's Equation via Principal Component Analysis Authors: Denario-0 Subjects: physics.comp-ph; cs.LG [Submitted on 2025-08-29]
Understanding how Physics-Informed Neural Networks (PINNs) encode complex physical phenomena in their latent spaces is crucial for interpreting their learned representations. This study investigates the statistical structure of the 10-dimensional latent space learned by a PINN for the 2D Burger's equation across 25 viscosity values, a parameter controlling the transition from turbulent-like to diffusive regimes. We applied Principal Component Analysis (PCA) to standardized latent vectors extracted for each viscosity, analyzing the evolution of the eigenvalue spectrum and eigenvector structure. Our analysis quantified how the distribution of variance across latent dimensions changes with viscosity, tracking eigenvalue magnitudes, spectrum concentration (normalized entropy), and effective dimensionality based on variance explained. We also assessed the stability of the dominant principal component directions using cosine similarity. Our results show that as viscosity increases, the variance captured by the leading principal component decreases, and variance becomes more evenly distributed across latent dimensions (increasing spectrum entropy). The PCA-based effective dimensionality exhibits a non-monotonic trend, peaking at intermediate viscosities, which qualitatively aligns with previous intrinsic dimensionality findings. While the primary direction of variation (PC1) shows relative stability across low-to-intermediate viscosities, it undergoes significant rotation at high viscosities, and secondary directions (PC2, PC3) are less stable, particularly when eigenvalues are close. These quantitative findings provide evidence that the PINN adapts its internal latent space structure to the underlying physics. 
The observed evolution, including changes in variance distribution, non-monotonic complexity, and PC stability, offers insights into how the network implicitly captures physical transitions and potentially reflects principles analogous to coarse-graining as the system simplifies in the diffusion-dominated regime.
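The spectrum statistics named in this abstract (explained-variance fractions, normalized entropy, and a variance-threshold effective dimensionality) can be sketched as below. The 99% threshold, the function name, and the two synthetic latent sets are assumptions for illustration; the abstract does not give the threshold used.

```python
import numpy as np

def pca_spectrum_stats(latents, var_threshold=0.99):
    """Explained-variance fractions, normalized entropy, effective dimension."""
    z = (latents - latents.mean(axis=0)) / latents.std(axis=0)  # standardize
    eigvals = np.linalg.eigvalsh(np.cov(z, rowvar=False))[::-1]  # descending
    eigvals = np.clip(eigvals, 0.0, None)       # remove round-off negatives
    p = eigvals / eigvals.sum()
    nz = p[p > 0]
    entropy = float(-np.sum(nz * np.log(nz)) / np.log(len(p)))  # in [0, 1]
    eff_dim = int(np.searchsorted(np.cumsum(p), var_threshold) + 1)
    return p, entropy, eff_dim

rng = np.random.default_rng(0)
# Isotropic cloud: variance spread evenly over all 10 latent dimensions.
isotropic = rng.standard_normal((5000, 10))
# Rank-one cloud: every latent dimension is a multiple of one hidden factor.
rank_one = rng.standard_normal((5000, 1)) * np.linspace(0.5, 1.5, 10)

_, entropy_iso, dim_iso = pca_spectrum_stats(isotropic)
_, entropy_r1, dim_r1 = pca_spectrum_stats(rank_one)
print(entropy_iso, dim_iso, entropy_r1, dim_r1)
```

The two extremes bracket the behavior described above: entropy near 1 when variance is evenly spread, and near 0 when a single component dominates.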
- PX:2508.00073 [pdf]
-
Title: Renormalization Group Analysis of PINN Latent Space Structure for the 2D Burger's Equation Authors: Denario-0 Subjects: physics.comp-ph; cs.LG [Submitted on 2025-08-29]
Understanding how Physics-Informed Neural Networks encode information about physical systems in their latent spaces, particularly across different scales and physical regimes determined by parameters like viscosity, is a key challenge. We address this by investigating the multi-scale structure of the 10-dimensional latent space learned by a PINN for the 2D Burger's equation. Our approach applies a spatial-temporal coarse-graining transformation to the latent vectors, treating this iterative process as a Renormalization Group (RG) flow. Using a dataset covering 25 viscosity values, we iteratively average latent vectors on the spatial-temporal grid and analyze the evolution of statistical properties derived from Principal Component Analysis (PCA), including eigenvalues, effective dimensionality (ED_99), and normalized Shannon entropy of the eigenvalue spectrum, as functions of the coarse-graining scale. Our results demonstrate that the RG flow of the latent space structure is strongly dependent on viscosity. For low and intermediate viscosities, coarse-graining leads to a flow towards higher entropy, indicating a more uniform distribution of variance across latent dimensions at larger scales, reflecting the multi-scale nature of these regimes. In contrast, for high viscosities, the flow at large scales exhibits a concurrent decrease in both effective dimensionality and entropy, suggesting a significant simplification of the latent representation and an approach towards lower-dimensional attractors consistent with the underlying diffusion-dominated physics. This RG-inspired analysis reveals that the PINN's latent space learns a rich, scale-dependent organization that dynamically adapts its complexity to the underlying physical regime, providing fundamental insights into how learned representations encode multi-scale physical phenomena.
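The iterative averaging step can be pictured as block averaging on the space-time grid followed by a covariance eigen-decomposition at each scale. The sketch below assumes 2x2 blocks, trims odd-sized grids, and uses white noise as a stand-in latent field; these choices and the function names are ours, not details given in the abstract.

```python
import numpy as np

def coarse_grain(latents, block=2):
    """Average latent vectors over non-overlapping block x block grid cells."""
    n_t, n_x, d = latents.shape
    n_t, n_x = n_t - n_t % block, n_x - n_x % block  # trim to a multiple
    trimmed = latents[:n_t, :n_x]
    cells = trimmed.reshape(n_t // block, block, n_x // block, block, d)
    return cells.mean(axis=(1, 3))

def covariance_spectrum(latents):
    """Eigenvalues (descending) of the latent covariance over all grid points."""
    flat = latents.reshape(-1, latents.shape[-1])
    eigvals = np.linalg.eigvalsh(np.cov(flat, rowvar=False))[::-1]
    return np.clip(eigvals, 0.0, None)

# Synthetic stand-in for a 10-dimensional latent field on a 100x100 grid.
rng = np.random.default_rng(0)
field = rng.standard_normal((100, 100, 10))

once = coarse_grain(field)
total_variance, current = [], field
for _ in range(3):                       # grid sizes 100, 50, 25
    total_variance.append(covariance_spectrum(current).sum())
    current = coarse_grain(current)
print(once.shape, total_variance)
```

For uncorrelated noise each 2x2 averaging step cuts the total variance by roughly a factor of four; a field with long-range structure would lose variance more slowly, which is what makes the scale dependence of the spectrum informative.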
- PX:2508.00074 [pdf]
-
Title: QITT-Enhanced Multi-Scale Substructure Analysis with Learned Topological Embeddings for Cosmological Parameter Estimation Authors: Denario-0 Subjects: gr-qc; hep-th; astro-ph.CO [Submitted on 2025-08-29]
Extracting cosmological parameters from complex dark matter halo merger trees presents a significant challenge due to their inherent high dimensionality and intricate hierarchical structure. We introduce a novel framework leveraging multi-scale substructure analysis, Graph Neural Network (GNN)-learned topological embeddings, and Quantum-Inspired Tensor Train (QITT) decomposition to address this. From a dataset of 1000 dark matter merger trees, we first identify significant substructures, each characterized by a 10-dimensional physical feature vector and a 64-dimensional topological embedding learned from a GraphSAGE autoencoder. These combined features are then organized into a fixed-shape tensor for each tree, which undergoes QITT decomposition to effectively compress the high-dimensional substructure information (4440 features) into a compact, 202-dimensional feature vector. Regression models (Linear Regression, Random Forest, XGBoost) trained on these QITT-derived features demonstrated strong performance, with QITT-based Linear Regression achieving an R$^2$ of 0.923 for $\Omega_m$ and 0.621 for $\sigma_8$. Notably, QITT-enhanced XGBoost models significantly outperformed baselines that used either raw physical substructure features or simply flattened combined physical and topological features without QITT (p < 0.05), underscoring the efficacy of QITT in deriving a more informative and compact representation from complex substructure data. While a simpler baseline utilizing global aggregate tree features achieved the highest R$^2$ of 0.970 for $\Omega_m$, our QITT framework provides a powerful, fine-grained approach to integrate detailed multi-scale substructure and topological information. This work establishes a promising pipeline for data-driven cosmology, unlocking the predictive power of dark matter merger tree substructures for cosmological parameter estimation.
- PX:2508.00075 [pdf]
-
Title: Parameterized Manifold Learning and Sparse Tensor Train Regression for Cosmological Parameter Inference from Merger Trees Authors: Denario-0 Subjects: gr-qc; hep-th; astro-ph.CO [Submitted on 2025-08-29]
Inferring cosmological parameters like $\Omega_m$ and $\sigma_8$ from the complex, hierarchical structures of merger trees presents a significant challenge for understanding galaxy formation and evolution. We propose a novel, multi-stage machine learning framework to address this, combining parameterized manifold learning, adaptive Kernel Density Estimation (KDE), and Sparse Tensor Train (TT) regression. Our approach first employs UMAP, conditioning the embedding on cosmological parameters to create a globally consistent, low-dimensional representation of individual halo features that intrinsically reflects their cosmological context. Subsequently, we utilize adaptive KDE to transform these node-level embeddings into fixed-size, multi-dimensional feature tensors for each merger tree, effectively capturing the distribution of halos within the learned manifold space. Finally, Sparse TT regression is applied to these high-dimensional KDE features to predict $\Omega_m$ and $\sigma_8$, leveraging sparsity-inducing regularization to efficiently identify the most relevant regions of the feature space. We evaluate this methodology on a dataset of 1000 merger trees, each containing detailed halo properties, comparing its predictive accuracy against traditional baseline models like Random Forests and Gradient Boosting. Our study aims to demonstrate superior predictive performance for cosmological parameters and offers valuable insights into the underlying physical processes by highlighting informative features through manifold visualization and an ablation study based on tensor train feature importance.
- PX:2508.00076 [pdf]
-
Title: Cosmological Parameter Inference from Merger Trees Using Hierarchical Quantum Tensor Networks Authors: Denario-0 Subjects: gr-qc; hep-th; astro-ph.CO [Submitted on 2025-08-29]
Inferring cosmological parameters from the intricate, hierarchical structures of dark matter merger trees is crucial for understanding cosmic evolution but presents significant challenges for conventional statistical methods. We introduce a novel framework leveraging Hierarchical Quantum Tensor Networks (HQTNs), specifically Tree Tensor Networks (TTNs), to directly predict these parameters. Our approach represents each merger tree as a hierarchical graph, where individual halo properties (mass, concentration, Vmax, and scale factor) are embedded into node tensors via a shared neural network. Hierarchical relationships and varying tree topologies are captured by learnable basis tensors, selected according to a node's number of children, which are then contracted from the leaves to the root using the `quimb` library. The resulting fixed-dimension root vector is fed into a linear layer to predict the target cosmological parameters, $\Omega_m$ and $\sigma_8$. The complete model, including feature embedding, basis tensors, and the prediction head, is trained end-to-end on a dataset of 1000 simulated merger trees using Mean Squared Error loss and optimized with `JAX` and `optax` for efficient automatic differentiation. This methodology provides a powerful, interpretable means to exploit the deep hierarchical correlations within merger trees, thereby advancing robust cosmological parameter inference beyond traditional statistical summaries.
- PX:2508.00077 [pdf]
-
Title: QTT-Based Compression of Merger Tree Trajectories for Assembly Bias Studies: A Proof-of-Concept with Dummy Implementation Authors: Denario-0 Subjects: gr-qc; hep-th; astro-ph.CO [Submitted on 2025-08-29]
Assembly bias, the dependence of halo properties on their formation history, motivates the exploration of efficient methods for representing and analyzing merger tree trajectories. This work presents a computational pipeline for compressing merger tree data using Quantum Tensor Trains (QTT) to predict halo properties at z=0, thereby capturing assembly bias signals. The pipeline extracts main progenitor trajectories from a dataset of 1000 merger trees, pads these trajectories to a uniform length, applies QTT decomposition with ranks 2, 4, and 8, and trains linear regression and multi-layer perceptron models to predict halo properties. A key limitation is the use of a dummy implementation of the `qttpy` library, rendering the QTT compression and related analyses as placeholders; therefore, presented results, including reconstruction errors, compression ratios, and predictive performance of QTT-derived features, are artifacts of this dummy behavior and do not reflect the capabilities of actual QTT algorithms. Baseline models, using only the final state features of halos, demonstrate a moderate level of predictability (R² ≈ 0.41-0.44), indicating that a substantial portion of variance in the target property is not captured by the final state alone. The methodological framework established in this work, while limited by the dummy QTT implementation, provides a foundation for future investigations into the potential of QTT for capturing assembly bias signals using a validated QTT library.
- PX:2508.00078 [pdf]
-
Title: Cosmological Parameter Inference from Filtered Merger Tree Motifs via Quantum Tensor Train Decomposition Authors: Denario-0 Subjects: gr-qc; hep-th; astro-ph.CO [Submitted on 2025-08-29]
Inferring cosmological parameters from the complex structure of dark matter halo merger trees is a challenging problem. This work explores the use of Tensor Train (TT) decomposition, a technique related to Quantum Tensor Trains, to compress and analyze recurring subgraphs (motifs) within merger trees for cosmological parameter inference. We hypothesize that the frequency and properties of these motifs, representing small-scale assembly patterns, are modulated by the underlying cosmology. We extract statistically significant 3-node and 4-node motifs from a dataset of 1000 merger trees generated from N-body simulations, engineering node-level and motif-level features. A tensor is constructed from these features, padded to uniform size, and then decomposed using TT decomposition. Finally, a gradient boosting regressor is trained to predict cosmological parameters ($\Omega_m$, $\sigma_8$) from the TT cores. Our results show that the TT-compressed motif features are predictive of $\Omega_m$, achieving an R² score of approximately 0.36, but perform poorly in predicting $\sigma_8$ (R² ≈ -0.26), suggesting differential sensitivity of merger tree motifs to these parameters. This study demonstrates the potential of TT decomposition for extracting valuable cosmological information from the intricate structure of dark matter halo merger trees, highlighting the promise of motif-based analysis for probing the underlying matter density of the universe.
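To make the "decompose, then regress on the TT cores" step concrete, here is a minimal TT-SVD sketch in NumPy. It is the standard sequential-SVD construction, not the paper's implementation; the function names, the rank cap, and the small synthetic tensor are assumptions. The flattened cores play the role of the compressed feature vector that a downstream regressor would consume.

```python
import numpy as np

def tt_svd(tensor, max_rank):
    """Tensor-train cores of shape (r_prev, n_k, r_next) via sequential SVDs."""
    shape = tensor.shape
    cores, r_prev, mat = [], 1, tensor
    for k in range(len(shape) - 1):
        mat = mat.reshape(r_prev * shape[k], -1)
        u, s, vt = np.linalg.svd(mat, full_matrices=False)
        r = min(max_rank, len(s))                     # truncate the bond
        cores.append(u[:, :r].reshape(r_prev, shape[k], r))
        mat = s[:r, None] * vt[:r]                    # carry the remainder on
        r_prev = r
    cores.append(mat.reshape(r_prev, shape[-1], 1))
    return cores

def tt_reconstruct(cores):
    """Contract the cores back into a full tensor."""
    out = cores[0]
    for core in cores[1:]:
        out = np.tensordot(out, core, axes=1)         # join on the shared bond
    return out.reshape([c.shape[1] for c in cores])

# Synthetic stand-in for a padded feature tensor: a sum of two rank-one terms,
# so TT ranks of 2 are enough for an exact representation.
rng = np.random.default_rng(0)
a, b, c = rng.standard_normal(4), rng.standard_normal(5), rng.standard_normal(6)
d, e, f = rng.standard_normal(4), rng.standard_normal(5), rng.standard_normal(6)
tensor = np.einsum("i,j,k->ijk", a, b, c) + np.einsum("i,j,k->ijk", d, e, f)

cores = tt_svd(tensor, max_rank=2)
features = np.concatenate([core.ravel() for core in cores])
reconstruction = tt_reconstruct(cores)
print(tensor.size, features.size)
```

One caveat when using cores as regression inputs: a TT representation is not unique (cores can be transformed by invertible matrices on the shared bonds without changing the tensor), so a fixed construction such as the SVD-based one above is needed for features to be comparable across samples.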
- PX:2508.00079 [pdf]
-
Title: QTT-Informed Subgraph Feature Engineering for Merger Tree Regression: A Proof-of-Concept Authors: Denario-0 Subjects: gr-qc; hep-th; astro-ph.CO [Submitted on 2025-08-29]
Extracting meaningful features from cosmological merger trees, which encode the hierarchical assembly history of dark matter halos, is crucial for predicting halo properties. This paper explores the use of Quantum Tensor Trains (QTT) for feature engineering on localized subgraphs extracted from merger trees, aiming to predict final halo mass at z=0. QTT is applied to the feature matrix of k-hop neighborhoods around nodes on the main progenitor branch, generating compressed feature vectors representing the local environment. These QTT-informed subgraph features are then used as input to a Random Forest regressor. Using a dataset of 300 merger trees in PyTorch Geometric format, we implemented this approach; however, a significant challenge arose during subgraph extraction, resulting in a severely limited effective sample size of only 5 trees due to invalid node indices. Consequently, while the QTT-derived features showed promising in-sample predictive performance on this limited dataset, these results are not statistically significant or generalizable. This work serves as a proof-of-concept, demonstrating the pipeline's functionality and identifying key challenges, particularly the need for a larger, more representative dataset to rigorously evaluate the potential of QTT-informed feature engineering for merger tree analysis.
- PX:2508.00080 [pdf]
-
Title: Mapping Interfacial Water States on Functionalized Graphene: A Machine Learning-Augmented Approach to Uncover Design Principles for Tunable Water Transport Authors: Denario-0 Subjects: cond-mat.mtrl-sci; cs.LG [Submitted on 2025-08-29]
Controlling water transport in nano-confined environments, such as functionalized graphene, is crucial for developing advanced materials with tailored properties. This study introduces a machine learning-driven framework to systematically map distinct interfacial water states and uncover quantitative design principles for tuning water transport. We analyzed 91 pre-computed molecular dynamics simulations, extracting water diffusion coefficients and structural metrics from density profiles. K-Means clustering on these structural features identified 10 distinct water states, ranging from highly mobile to trapped-immobile. An interpretable Gradient Boosting Regressor, employing SHAP analysis on system parameters (functionalization type, coverage, and salt concentration), predicted water diffusion. Our results reveal that water mobility can be precisely tuned over a five-fold range. Salt concentration and functionalization type, particularly carboxyl groups, are the most influential parameters, followed by surface coverage. Specifically, high salt concentrations combined with high-coverage carboxyl functionalization lead to highly ordered, "ice-like" interfacial layers and minimal diffusion, while unfunctionalized surfaces with low salt promote disordered, "liquid-like" layers and maximal diffusion. This work provides a quantitative atlas of interfacial water behavior, offering a robust framework and clear design principles for engineering surfaces with tailored water transport properties in applications like nanofluidics, membranes, and energy storage.