
MolClass version 2.0.0

MolClass provides pharmacological and physiological models for evaluating candidates from small-molecule high-throughput screens. It can also build supervised machine learning models from small-molecule datasets, using structural features and chemical properties identified in hit and non-hit molecule populations. Binary and multi-class models are supported, although the histogram display of a model shows only two classes. Regression models are planned for the next release.

Folder Structure

build - contains the compiled MolClass Java classes
dist - contains MolClass.jar and the dependencies needed to run MolClass from the command line
html/molclass/api - contains the unified FastAPI REST service in Python 3 (no authentication required locally; uploaded SDF files are stored in `uploads/`)
html/molclass/tools - tools that update and maintain the MolClass MySQL database
html/molclass/web - PHP 5/PEAR web application (running on Ubuntu 14.04 LTS)
src - contains the Java source code for MolClass
uploads - temporary storage for uploaded SDF files (currently retained)
lib - dependencies for MolClass version 2.0.0
nbproject - the NetBeans project configuration

Update June 2026 (version 2.0.0)

  • Library Upgrades: Upgraded core chemistry and machine learning dependencies:
    • Chemistry Development Kit (CDK) upgraded from version 1.4 to cdk-2.12.jar.
    • Weka machine learning library upgraded from 3.6 to weka-stable-3.8.6.jar.
  • API Modernization: Consolidated the PHP Slim and Python Flask REST APIs into a single, high-performance Python FastAPI service with SQLAlchemy connection pooling, request-scoped sessions, Pydantic response validation, and automated Swagger OpenAPI documentation.
  • Architecture Modernization (MolClass v2 UI):
    • Developed a standalone Spring Boot REST API (spring_boot_predictor) that handles job queuing and interfaces with the Java Weka pipeline.
    • Built a modern, responsive Next.js React Frontend to replace the legacy PHP/Pear web application. It includes real-time tables, structure rendering, and configuration forms.
  • Machine Learning Enhancements:
    • Implemented dynamic algorithm selection for feature optimization in ModelBuilder, adding native support for Weka's ReliefFAttributeEval (Relief-F) to reduce dimensionality while remaining robust to noise. The legacy correlation-based evaluator (CfsSubsetEval) and a None option that bypasses feature selection remain selectable.
    • Replaced legacy $O(N)$, index-unfriendly database query patterns with an index-covered UNION lookup for molecules, accelerating compound searches by roughly 20,000x.
    • Added an XML configuration cache to eliminate redundant disk I/O.
  • Multithreading: Rewrote fingerprinters and similarity calculators to support thread-safe parallel calculations utilizing thread-local database connections and configurable thread pools.
  • Verification: Built a comprehensive automated JUnit test suite validating all 15 classifier schemes, descriptors, scaffolds, and parallel pipelines.
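
The index-covered UNION lookup mentioned above can be illustrated with a small, self-contained sketch. Everything specific here is an assumption made for illustration: the `molecule` table, its `name` and `inchikey` columns, the index names, and the placeholder data are hypothetical, and SQLite stands in for the MySQL database that MolClass actually uses. The OR query is only one possible shape of an index-unfriendly lookup, not necessarily the one that was replaced.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical molecule table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE molecule (
            mol_id   INTEGER PRIMARY KEY,
            name     TEXT,
            inchikey TEXT
        );
        -- One index per searchable column, so each UNION branch below
        -- can be answered from a single index.
        CREATE INDEX idx_molecule_name ON molecule (name);
        CREATE INDEX idx_molecule_inchikey ON molecule (inchikey);
    """)
    conn.executemany(
        "INSERT INTO molecule (mol_id, name, inchikey) VALUES (?, ?, ?)",
        [(1, "compound-a", "KEY-A"), (2, "compound-b", "KEY-B"), (3, "compound-c", "KEY-C")],
    )
    return conn


# A single predicate spanning two columns; depending on the optimizer this
# may fall back to scanning the whole table.
OR_SQL = "SELECT mol_id FROM molecule WHERE name = :q OR inchikey = :q"

# The same lookup split into one indexed search per column, merged by UNION.
UNION_SQL = """
    SELECT mol_id FROM molecule WHERE name = :q
    UNION
    SELECT mol_id FROM molecule WHERE inchikey = :q
"""


def find_molecules(conn, sql, query):
    """Run a lookup and return the matching ids in ascending order."""
    rows = conn.execute(sql, {"q": query}).fetchall()
    return sorted(row[0] for row in rows)


def query_plan(conn, sql, query):
    """Return the human-readable steps of SQLite's query plan."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, {"q": query}).fetchall()
    return [row[3] for row in rows]


if __name__ == "__main__":
    demo = build_demo_db()
    print(find_molecules(demo, UNION_SQL, "KEY-C"))
    for step in query_plan(demo, UNION_SQL, "KEY-C"):
        print(step)
```

Both queries return the same ids. The point of the UNION form is that each branch filters on exactly one indexed column, which the printed query plan makes visible.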
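
The multithreading item above describes thread-local database connections combined with a configurable thread pool. The MolClass fingerprinters are written in Java; the sketch below only illustrates the general pattern in Python, under stated assumptions: the `fingerprint` table, the integer bit-mask encoding, and all function names are hypothetical, and a temporary SQLite file stands in for the MySQL database.

```python
import os
import sqlite3
import tempfile
import threading
from concurrent.futures import ThreadPoolExecutor

# Each worker thread keeps its own connection here, so no connection
# object is ever shared between threads.
_thread_state = threading.local()


def get_connection(db_path):
    """Return the calling thread's connection, opening it on first use."""
    conn = getattr(_thread_state, "conn", None)
    if conn is None:
        conn = sqlite3.connect(db_path)
        _thread_state.conn = conn
    return conn


def tanimoto(a, b):
    """Tanimoto coefficient of two fingerprints stored as integer bit masks."""
    union = bin(a | b).count("1")
    return bin(a & b).count("1") / union if union else 0.0


def similarity_to_query(db_path, query_bits, mol_id):
    """Load one stored fingerprint and compare it with the query."""
    conn = get_connection(db_path)
    (bits,) = conn.execute(
        "SELECT bits FROM fingerprint WHERE mol_id = ?", (mol_id,)
    ).fetchone()
    return mol_id, tanimoto(query_bits, bits)


def parallel_similarities(db_path, query_bits, mol_ids, workers=4):
    """Score all molecules on a pool whose size is set by `workers`."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        pairs = pool.map(
            lambda mol_id: similarity_to_query(db_path, query_bits, mol_id),
            mol_ids,
        )
        return dict(pairs)


def build_demo_db(db_path):
    """Write a tiny hypothetical fingerprint table to db_path."""
    conn = sqlite3.connect(db_path)
    conn.execute("CREATE TABLE fingerprint (mol_id INTEGER PRIMARY KEY, bits INTEGER)")
    conn.executemany(
        "INSERT INTO fingerprint VALUES (?, ?)",
        [(1, 0b1111), (2, 0b1100), (3, 0b0011)],
    )
    conn.commit()
    conn.close()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.db")
        build_demo_db(path)
        print(parallel_similarities(path, 0b1100, [1, 2, 3], workers=2))
```

The sketch leaves connection cleanup to the interpreter when worker threads exit. A long-running service would more likely close each connection explicitly when its worker shuts down.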

Update May 2019 (version 1.71)

  • MolClass is being moved to chemgrid.org/molclass because the original servers were taken down due to age-related instability.
  • Because of memory limitations on chemgrid.org, model building will be restricted to libraries of up to one thousand molecules.
  • Instructions on how to install, run, and use MolClass are being moved to the GitHub Wiki: https://github.com/jwildenhain/molclass/wiki/1.-MolClass-Wiki
  • A virtual machine with MolClass is available on request.
  • You can install the MolClass database and the Flask REST service to access the data from R.
  • MolClass will get a supporting R package that uses the current data models to design and benchmark your own activity predictions.

About

Molecule Classification and Activity Prediction Portal - Machine Learning and Cheminformatics
