lamindb.models¶
Models library.
Feature and label managers¶
- class lamindb.models.FeatureManager(host)¶
Feature manager.
- property slots: dict[str, Schema]¶
Features by schema slot.
Example:
artifact.features.slots #> {'var': <Schema: var>, 'obs': <Schema: obs>}
- describe(return_str=False)¶
Pretty print features.
This is what artifact.describe() calls under the hood.
- Return type:
str | None
- get_values(external_only=False)¶
Get features as a dictionary.
Includes annotation with internal and external feature values.
- Parameters:
external_only (bool, default: False) – If True, only return external feature annotations.
- Return type:
dict[str, Any]
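A minimal usage sketch (assuming an artifact that already carries feature annotations):
values = artifact.features.get_values()  # internal & external annotations as a plain dict
external_values = artifact.features.get_values(external_only=True)  # only external annotations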
- add_values(values, feature_field=FieldAttr(Feature.name), schema=None)¶
Add values for features.
- Parameters:
values (dict[str, str | int | float | bool]) – A dictionary of keys (features) & values (labels, strings, numbers, booleans, datetimes, etc.). If a value is None, it will be skipped.
feature_field (DeferredAttribute, default: FieldAttr(Feature.name)) – The field of a registry to map the keys of the values dictionary.
schema (Schema, default: None) – Schema to validate against.
- Return type:
None
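A minimal sketch, assuming features named "temperature" and "experiment" are already registered with matching data types:
artifact.features.add_values({"temperature": 21.6, "experiment": "Experiment 1"})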
- set_values(values, feature_field=FieldAttr(Feature.name), schema=None)¶
Set values for features.
Like add_values, but first removes all existing external feature annotations.
- Parameters:
values (dict[str, str | int | float | bool]) – A dictionary of keys (features) & values (labels, strings, numbers, booleans, datetimes, etc.). If a value is None, it will be skipped.
feature_field (DeferredAttribute, default: FieldAttr(Feature.name)) – The field of a registry to map the keys of the values dictionary.
schema (Schema, default: None) – Schema to validate against.
- Return type:
None
- remove_values(feature=None, *, value=None)¶
Remove values for features.
- Parameters:
- Return type:
None
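A hedged sketch, assuming the feature annotations added above:
artifact.features.remove_values("temperature")  # remove all values annotated for this feature
artifact.features.remove_values("experiment", value="Experiment 1")  # remove a single value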
- class lamindb.models.LabelManager(host)¶
Label manager.
This allows managing untyped labels (ULabel) and arbitrary typed labels (e.g., CellLine), and associating labels with features.
- describe(return_str=True)¶
Describe the labels.
- Return type:
str
- add(records, feature=None)¶
Add one or several labels and associate them with a feature.
- get(feature, mute=False, flat_names=False)¶
Get labels given a feature.
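A minimal sketch combining add and get (assuming an existing Feature "project" and a saved artifact):
project = ln.Feature.get(name="project")
label = ln.ULabel(name="Project A").save()
artifact.labels.add(label, feature=project)
artifact.labels.get(project)  # labels linked to the artifact via the "project" feature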
- add_from(data, transfer_logs=None)¶
Add labels from an artifact or collection to another artifact or collection.
- Return type:
None
Examples
artifact1 = ln.Artifact(pd.DataFrame(index=[0, 1])).save()
artifact2 = ln.Artifact(pd.DataFrame(index=[2, 3])).save()
records = ln.Record.from_values(["Label1", "Label2"], field="name").save()
labels = ln.Record.filter(name__icontains="label")
artifact1.records.set(labels)
artifact2.labels.add_from(artifact1)
Registry base classes¶
- class lamindb.models.BaseSQLRecord(*args, **kwargs)¶
Basic metadata record.
It has the same methods as SQLRecord, but doesn’t have the additional fields.
It’s mainly used for IsLinks and similar.
- classmethod filter(*queries, **expressions)¶
Query records.
- Parameters:
queries – One or multiple Q objects.
expressions – Fields and values passed as Django query expressions.
- Return type:
QuerySet
See also
Guide: Query & search registries
Django documentation: Queries
Examples
>>> ln.Project(name="my label").save()
>>> ln.Project.filter(name__startswith="my").to_dataframe()
- classmethod get(idlike=None, **expressions)¶
Get a single record.
- Parameters:
idlike (int | str | None, default: None) – Either a uid stub, uid or an integer id.
expressions – Fields and values passed as Django query expressions.
- Raises:
lamindb.errors.DoesNotExist – In case no matching record is found.
- Return type:
TypeVar(T, bound=SQLRecord)
See also
Guide: Query & search registries
Django documentation: Queries
Examples
record = ln.Record.get("FvtpPJLJ")
record = ln.Record.get(name="my-label")
- classmethod to_dataframe(include=None, features=False, limit=100)¶
Evaluate and convert to pd.DataFrame.
By default, maps simple fields and foreign keys onto DataFrame columns.
Guide: Query & search registries
- Parameters:
include (str | list[str] | None, default: None) – Related data to include as columns. Takes strings of form "records__name", "cell_types__name", etc. or a list of such strings. For Artifact, Record, and Run, can also pass "features" to include features with data types pointing to entities in the core schema. If "privates", includes private fields (fields starting with _).
features (bool | list[str], default: False) – Configure the features to include. Can be a feature name or a list of such names. If "queryset", infers the features used within the current queryset. Only available for Artifact, Record, and Run.
limit (int, default: 100) – Maximum number of rows to display. If None, includes all results.
order_by – Field name to order the records by. Prefix with ‘-’ for descending order. Defaults to ‘-id’ to get the most recent records. This argument is ignored if the queryset is already ordered or if the specified field does not exist.
- Return type:
DataFrame
Examples
Include the name of the creator:
ln.Record.to_dataframe(include="created_by__name")
Include features:
ln.Artifact.to_dataframe(include="features")
Include selected features:
ln.Artifact.to_dataframe(features=["cell_type_by_expert", "cell_type_by_model"])
- classmethod search(string, *, field=None, limit=20, case_sensitive=False)¶
Search.
- Parameters:
string (str) – The input string to match against the field ontology values.
field (str | DeferredAttribute | None, default: None) – The field or fields to search. Searches all string fields by default.
limit (int | None, default: 20) – Maximum number of top results to return.
case_sensitive (bool, default: False) – Whether the match is case sensitive.
- Return type:
QuerySet
- Returns:
A sorted DataFrame of search results with a score in column score, or a QuerySet if return_queryset is True.
Examples
records = ln.Record.from_values(["Label1", "Label2", "Label3"], field="name").save()
ln.Record.search("Label2")
- classmethod lookup(field=None, return_field=None)¶
Return an auto-complete object for a field.
- Parameters:
field (str | DeferredAttribute | None, default: None) – The field to look up the values for. Defaults to the first string field.
return_field (str | DeferredAttribute | None, default: None) – The field to return. If None, returns the whole record.
keep – When multiple records are found for a lookup, how to return the records. "first": return the first record. "last": return the last record. False: return all records.
- Return type:
NamedTuple
- Returns:
A NamedTuple of lookup information of the field values with a dictionary converter.
See also
Examples
Lookup via auto-complete on the returned object:
import bionty as bt
bt.Gene.from_source(symbol="ADGB-DT").save()
lookup = bt.Gene.lookup()
lookup.adgb_dt
Look up via auto-complete in dictionary:
lookup_dict = lookup.dict()
lookup_dict['ADGB-DT']
Look up via a specific field:
lookup_by_ensembl_id = bt.Gene.lookup(field="ensembl_gene_id")
lookup_by_ensembl_id.ensg00000002745
Return a specific field value instead of the full record:
lookup_return_symbols = bt.Gene.lookup(field="ensembl_gene_id", return_field="symbol")
- classmethod connect(instance)¶
Query a non-default LaminDB instance.
- Parameters:
instance (str | None) – An instance identifier of form “account_handle/instance_name”.
- Return type:
QuerySet
Examples
ln.Record.connect("account_handle/instance_name").search("label7", field="name")
- save(*args, **kwargs)¶
Save.
Always saves to the default database.
- Return type:
TypeVar(T, bound= SQLRecord)
- delete(permanent=None)¶
Delete.
- Parameters:
permanent (bool | None, default: None) – For consistency, False raises an error, as soft delete is impossible.
- Return type:
None
- refresh_from_db(using=None, fields=None, from_queryset=None)¶
Reload field values from the database.
By default, the reloading happens from the database this instance was loaded from, or by the read router if this instance wasn’t loaded from any database. The using parameter will override the default.
Fields can be used to specify which fields to reload. The fields should be an iterable of field attnames. If fields is None, then all non-deferred fields are reloaded.
When accessing deferred fields of an instance, the deferred loading of the field will call this method.
- async arefresh_from_db(using=None, fields=None, from_queryset=None)¶
- class lamindb.models.SQLRecord(*args, **kwargs)¶
Metadata record.
Every SQLRecord is a data model that comes with a registry in form of a SQL table in your database.
Sub-classing SQLRecord creates a new registry while instantiating a SQLRecord creates a new record.
Example:
from lamindb import SQLRecord, fields

# sub-classing `SQLRecord` creates a new registry
class Experiment(SQLRecord):
    name: str = fields.CharField()

# instantiating `Experiment` creates a record `experiment`
experiment = Experiment(name="my experiment")

# you can save the record to the database
experiment.save()

# `Experiment` refers to the registry, which you can query
df = Experiment.filter(name__startswith="my ").to_dataframe()
SQLRecord’s metaclass is Registry. SQLRecord inherits from Django’s Model class. Why does LaminDB call it SQLRecord and not Model? The term SQLRecord can’t lead to confusion with statistical, machine learning or biological models.
- is_locked: bool¶
Whether the record is locked for edits.
- branch: Branch¶
Life cycle state of record.
branch.name can be “main” (default branch), “trash” (trash), branch.name = "archive" (archived), or any other user-created branch typically planned for merging onto main after review.
- classmethod filter(*queries, **expressions)¶
Query records.
- Parameters:
queries – One or multiple Q objects.
expressions – Fields and values passed as Django query expressions.
- Return type:
QuerySet
See also
Guide: Query & search registries
Django documentation: Queries
Examples
>>> ln.Project(name="my label").save()
>>> ln.Project.filter(name__startswith="my").to_dataframe()
- classmethod get(idlike=None, **expressions)¶
Get a single record.
- Parameters:
idlike (int | str | None, default: None) – Either a uid stub, uid or an integer id.
expressions – Fields and values passed as Django query expressions.
- Raises:
lamindb.errors.DoesNotExist – In case no matching record is found.
- Return type:
TypeVar(T, bound=SQLRecord)
See also
Guide: Query & search registries
Django documentation: Queries
Examples
record = ln.Record.get("FvtpPJLJ")
record = ln.Record.get(name="my-label")
- classmethod to_dataframe(include=None, features=False, limit=100)¶
Evaluate and convert to pd.DataFrame.
By default, maps simple fields and foreign keys onto DataFrame columns.
Guide: Query & search registries
- Parameters:
include (str | list[str] | None, default: None) – Related data to include as columns. Takes strings of form "records__name", "cell_types__name", etc. or a list of such strings. For Artifact, Record, and Run, can also pass "features" to include features with data types pointing to entities in the core schema. If "privates", includes private fields (fields starting with _).
features (bool | list[str], default: False) – Configure the features to include. Can be a feature name or a list of such names. If "queryset", infers the features used within the current queryset. Only available for Artifact, Record, and Run.
limit (int, default: 100) – Maximum number of rows to display. If None, includes all results.
order_by – Field name to order the records by. Prefix with ‘-’ for descending order. Defaults to ‘-id’ to get the most recent records. This argument is ignored if the queryset is already ordered or if the specified field does not exist.
- Return type:
DataFrame
Examples
Include the name of the creator:
ln.Record.to_dataframe(include="created_by__name")
Include features:
ln.Artifact.to_dataframe(include="features")
Include selected features:
ln.Artifact.to_dataframe(features=["cell_type_by_expert", "cell_type_by_model"])
- classmethod search(string, *, field=None, limit=20, case_sensitive=False)¶
Search.
- Parameters:
string (str) – The input string to match against the field ontology values.
field (str | DeferredAttribute | None, default: None) – The field or fields to search. Searches all string fields by default.
limit (int | None, default: 20) – Maximum number of top results to return.
case_sensitive (bool, default: False) – Whether the match is case sensitive.
- Return type:
QuerySet
- Returns:
A sorted DataFrame of search results with a score in column score, or a QuerySet if return_queryset is True.
Examples
records = ln.Record.from_values(["Label1", "Label2", "Label3"], field="name").save()
ln.Record.search("Label2")
- classmethod lookup(field=None, return_field=None)¶
Return an auto-complete object for a field.
- Parameters:
field (str | DeferredAttribute | None, default: None) – The field to look up the values for. Defaults to the first string field.
return_field (str | DeferredAttribute | None, default: None) – The field to return. If None, returns the whole record.
keep – When multiple records are found for a lookup, how to return the records. "first": return the first record. "last": return the last record. False: return all records.
- Return type:
NamedTuple
- Returns:
A NamedTuple of lookup information of the field values with a dictionary converter.
See also
Examples
Lookup via auto-complete on the returned object:
import bionty as bt
bt.Gene.from_source(symbol="ADGB-DT").save()
lookup = bt.Gene.lookup()
lookup.adgb_dt
Look up via auto-complete in dictionary:
lookup_dict = lookup.dict()
lookup_dict['ADGB-DT']
Look up via a specific field:
lookup_by_ensembl_id = bt.Gene.lookup(field="ensembl_gene_id")
lookup_by_ensembl_id.ensg00000002745
Return a specific field value instead of the full record:
lookup_return_symbols = bt.Gene.lookup(field="ensembl_gene_id", return_field="symbol")
- classmethod connect(instance)¶
Query a non-default LaminDB instance.
- Parameters:
instance (str | None) – An instance identifier of form “account_handle/instance_name”.
- Return type:
QuerySet
Examples
ln.Record.connect("account_handle/instance_name").search("label7", field="name")
- restore()¶
Restore from trash onto the main branch.
Does not restore descendant records if the record is HasType with is_type = True.
- Return type:
None
- delete(permanent=None, **kwargs)¶
Delete record.
If record is HasType with is_type = True, deletes all descendant records, too.
- Parameters:
permanent (bool | None, default: None) – Whether to permanently delete the record (skips trash). If None, performs soft delete if the record is not already in the trash.
- Return type:
None
Examples
For any SQLRecord object record, call:
>>> record.delete()
- save(*args, **kwargs)¶
Save.
Always saves to the default database.
- Return type:
TypeVar(T, bound= SQLRecord)
- refresh_from_db(using=None, fields=None, from_queryset=None)¶
Reload field values from the database.
By default, the reloading happens from the database this instance was loaded from, or by the read router if this instance wasn’t loaded from any database. The using parameter will override the default.
Fields can be used to specify which fields to reload. The fields should be an iterable of field attnames. If fields is None, then all non-deferred fields are reloaded.
When accessing deferred fields of an instance, the deferred loading of the field will call this method.
- async arefresh_from_db(using=None, fields=None, from_queryset=None)¶
- class lamindb.models.Registry(name, bases, attrs, **kwargs)¶
Metaclass for SQLRecord.
Each Registry object is a SQLRecord class and corresponds to a table in the metadata SQL database.
You work with Registry objects whenever you use class methods of SQLRecord.
You call any subclass of SQLRecord a “registry” and their objects “records”. A SQLRecord object corresponds to a row in the SQL table.
If you want to create a new registry, you sub-class SQLRecord.
Example:
from lamindb import SQLRecord, fields

# sub-classing `SQLRecord` creates a new registry
class Experiment(SQLRecord):
    name: str = fields.CharField()

# instantiating `Experiment` creates a record `experiment`
experiment = Experiment(name="my experiment")

# you can save the record to the database
experiment.save()

# `Experiment` refers to the registry, which you can query
df = Experiment.filter(name__startswith="my ").to_dataframe()
Note: Registry inherits from Django’s ModelBase.
- lookup(field=None, return_field=None, keep='first')¶
Return an auto-complete object for a field.
- Parameters:
field (str | DeferredAttribute | None, default: None) – The field to look up the values for. Defaults to the first string field.
return_field (str | DeferredAttribute | None, default: None) – The field to return. If None, returns the whole record.
keep (Literal['first', 'last', False], default: 'first') – When multiple records are found for a lookup, how to return the records. "first": return the first record. "last": return the last record. False: return all records.
- Return type:
NamedTuple
- Returns:
A NamedTuple of lookup information of the field values with a dictionary converter.
See also
Examples
Lookup via auto-complete on the returned object:
import bionty as bt
bt.Gene.from_source(symbol="ADGB-DT").save()
lookup = bt.Gene.lookup()
lookup.adgb_dt
Look up via auto-complete in dictionary:
lookup_dict = lookup.dict()
lookup_dict['ADGB-DT']
Look up via a specific field:
lookup_by_ensembl_id = bt.Gene.lookup(field="ensembl_gene_id")
lookup_by_ensembl_id.ensg00000002745
Return a specific field value instead of the full record:
lookup_return_symbols = bt.Gene.lookup(field="ensembl_gene_id", return_field="symbol")
- filter(*queries, **expressions)¶
Query records.
- Parameters:
queries – One or multiple Q objects.
expressions – Fields and values passed as Django query expressions.
- Return type:
QuerySet
See also
Guide: Query & search registries
Django documentation: Queries
Examples
>>> ln.Project(name="my label").save()
>>> ln.Project.filter(name__startswith="my").to_dataframe()
- get(idlike=None, **expressions)¶
Get a single record.
- Parameters:
idlike (int | str | None, default: None) – Either a uid stub, uid or an integer id.
expressions – Fields and values passed as Django query expressions.
- Raises:
lamindb.errors.DoesNotExist – In case no matching record is found.
- Return type:
TypeVar(T, bound= SQLRecord)
See also
Guide: Query & search registries
Django documentation: Queries
Examples
record = ln.Record.get("FvtpPJLJ")
record = ln.Record.get(name="my-label")
- to_dataframe(*, include=None, features=None, limit=100, order_by='-id')¶
Evaluate and convert to pd.DataFrame.
By default, maps simple fields and foreign keys onto DataFrame columns.
Guide: Query & search registries
- Parameters:
include (str | list[str] | None, default: None) – Related data to include as columns. Takes strings of form "records__name", "cell_types__name", etc. or a list of such strings. For Artifact, Record, and Run, can also pass "features" to include features with data types pointing to entities in the core schema. If "privates", includes private fields (fields starting with _).
features (str | list[str] | None, default: None) – Configure the features to include. Can be a feature name or a list of such names. If "queryset", infers the features used within the current queryset. Only available for Artifact, Record, and Run.
limit (int | None, default: 100) – Maximum number of rows to display. If None, includes all results.
order_by (str | None, default: '-id') – Field name to order the records by. Prefix with ‘-’ for descending order. Defaults to ‘-id’ to get the most recent records. This argument is ignored if the queryset is already ordered or if the specified field does not exist.
- Return type:
DataFrame
Examples
Include the name of the creator:
ln.Record.to_dataframe(include="created_by__name")
Include features:
ln.Artifact.to_dataframe(include="features")
Include selected features:
ln.Artifact.to_dataframe(features=["cell_type_by_expert", "cell_type_by_model"])
- search(string, *, field=None, limit=20, case_sensitive=False)¶
Search.
- Parameters:
string (str) – The input string to match against the field ontology values.
field (str | DeferredAttribute | None, default: None) – The field or fields to search. Searches all string fields by default.
limit (int | None, default: 20) – Maximum number of top results to return.
case_sensitive (bool, default: False) – Whether the match is case sensitive.
- Return type:
QuerySet
- Returns:
A sorted DataFrame of search results with a score in column score, or a QuerySet if return_queryset is True.
Examples
records = ln.Record.from_values(["Label1", "Label2", "Label3"], field="name").save()
ln.Record.search("Label2")
- connect(instance)¶
Query a non-default LaminDB instance.
- Parameters:
instance (str | None) – An instance identifier of form “account_handle/instance_name”.
- Return type:
QuerySet
Examples
ln.Record.connect("account_handle/instance_name").search("label7", field="name")
Mixins for registries¶
- class lamindb.models.IsVersioned¶
- class lamindb.models.IsVersioned(*db_args)
Base class for versioned models.
- Meta = <class 'lamindb.models._is_versioned.IsVersioned.Meta'>¶
- property pk¶
- property stem_uid: str¶
Universal id characterizing the version family.
The full uid of a record is obtained via concatenating the stem uid and version information:
stem_uid = random_base62(n_char)  # a random base62 sequence of length 12 (transform) or 16 (artifact, collection)
version_uid = "0000"  # an auto-incrementing 4-digit base62 number
uid = f"{stem_uid}{version_uid}"  # concatenate the stem_uid & version_uid
- property versions: QuerySet¶
Lists all records of the same version family.
>>> new_artifact = ln.Artifact(df2, revises=artifact).save()
>>> new_artifact.versions
- refresh_from_db(using=None, fields=None, from_queryset=None)¶
Reload field values from the database.
By default, the reloading happens from the database this instance was loaded from, or by the read router if this instance wasn’t loaded from any database. The using parameter will override the default.
Fields can be used to specify which fields to reload. The fields should be an iterable of field attnames. If fields is None, then all non-deferred fields are reloaded.
When accessing deferred fields of an instance, the deferred loading of the field will call this method.
- async arefresh_from_db(using=None, fields=None, from_queryset=None)¶
- save(*args, force_insert=False, force_update=False, using=None, update_fields=None)¶
Save the current instance. Override this in a subclass if you want to control the saving process.
The ‘force_insert’ and ‘force_update’ parameters can be used to insist that the “save” must be an SQL insert or update (or equivalent for non-SQL backends), respectively. Normally, they should not be set.
- delete(using=None, keep_parents=False)¶
- class lamindb.models.HasType¶
Mixin for registries that have a hierarchical type assigned.
Such registries have a .type foreign key pointing to themselves.
A type hence allows hierarchically grouping records under types.
For instance, using the example of ln.Record:
experiment_type = ln.Record(name="Experiment", is_type=True).save()
experiment1 = ln.Record(name="Experiment 1", type=experiment_type).save()
experiment2 = ln.Record(name="Experiment 2", type=experiment_type).save()
- query_types()¶
Query types of a record recursively.
While .type retrieves the type, this method retrieves all super types of that type:
# Create type hierarchy
type1 = model_class(name="Type1", is_type=True).save()
type2 = model_class(name="Type2", is_type=True, type=type1).save()
type3 = model_class(name="Type3", is_type=True, type=type2).save()

# Create a record with type3
record = model_class(name=f"{model_name}3", type=type3).save()

# Query super types
super_types = record.query_types()
assert super_types[0] == type3
assert super_types[1] == type2
assert super_types[2] == type1
- Return type:
- class lamindb.models.HasParents¶
Base class for hierarchical registries (ontologies).
- view_parents(field=None, with_children=False, distance=5)¶
View parents in an ontology.
- Parameters:
field (str | DeferredAttribute | None, default: None) – Field to display on graph.
with_children (bool, default: False) – Whether to also show children.
distance (int, default: 5) – Maximum distance still shown.
Ontological hierarchies: ULabel (project & sub-project), CellType (cell type & subtype).
Examples
>>> import bionty as bt
>>> bt.Tissue.from_source(name="subsegmental bronchus").save()
>>> record = bt.Tissue.get(name="respiratory tube")
>>> record.view_parents()
>>> record.view_parents(with_children=True)
- view_children(field=None, distance=5)¶
View children in an ontology.
- Parameters:
field (str | DeferredAttribute | None, default: None) – Field to display on graph.
distance (int, default: 5) – Maximum distance still shown.
Ontological hierarchies: ULabel (project & sub-project), CellType (cell type & subtype).
Examples
>>> import bionty as bt
>>> bt.Tissue.from_source(name="subsegmental bronchus").save()
>>> record = bt.Tissue.get(name="respiratory tube")
>>> record.view_children()
- class lamindb.models.CanCurate¶
Base class providing SQLRecord-based validation.
- classmethod inspect(values, field=None, *, mute=False, organism=None, source=None, from_source=True, strict_source=False)¶
Inspect if values are mappable to a field.
Being mappable means that an exact match exists.
- Parameters:
values (list[str] | Series | array) – Values that will be checked against the field.
field (str | DeferredAttribute | None, default: None) – The field of values. Examples are 'ontology_id' to map against the source ID or 'name' to map against the ontologies field names.
mute (bool, default: False) – Whether to mute logging.
organism (str | SQLRecord | None, default: None) – An Organism name or record.
source (SQLRecord | None, default: None) – A bionty.Source record that specifies the version to inspect against.
strict_source (bool, default: False) – Determines the validation behavior against records in the registry. If False, validation will include all records in the registry, ignoring the specified source. If True, validation will only include records in the registry that are linked to the specified source. Note: this parameter won’t affect validation against public sources.
- Return type:
bionty.base.dev.InspectResult
See also
Example:
import bionty as bt

# save some gene records
bt.Gene.from_values(["A1CF", "A1BG", "BRCA2"], field="symbol", organism="human").save()

# inspect gene symbols
gene_symbols = ["A1CF", "A1BG", "FANCD1", "FANCD20"]
result = bt.Gene.inspect(gene_symbols, field=bt.Gene.symbol, organism="human")
assert result.validated == ["A1CF", "A1BG"]
assert result.non_validated == ["FANCD1", "FANCD20"]
- classmethod validate(values, field=None, *, mute=False, organism=None, source=None, strict_source=False)¶
Validate values against existing values of a string field.
Note that this is strict validation, which only asserts exact matches.
- Parameters:
values (list[str] | Series | array) – Values that will be validated against the field.
field (str | DeferredAttribute | None, default: None) – The field of values. Examples are 'ontology_id' to map against the source ID or 'name' to map against the ontologies field names.
mute (bool, default: False) – Whether to mute logging.
organism (str | SQLRecord | None, default: None) – An Organism name or record.
source (SQLRecord | None, default: None) – A bionty.Source record that specifies the version to validate against.
strict_source (bool, default: False) – Determines the validation behavior against records in the registry. If False, validation will include all records in the registry, ignoring the specified source. If True, validation will only include records in the registry that are linked to the specified source. Note: this parameter won’t affect validation against public sources.
- Return type:
ndarray
- Returns:
A vector of booleans indicating if an element is validated.
See also
Example:
import bionty as bt

bt.Gene.from_values(["A1CF", "A1BG", "BRCA2"], field="symbol", organism="human").save()
gene_symbols = ["A1CF", "A1BG", "FANCD1", "FANCD20"]
bt.Gene.validate(gene_symbols, field=bt.Gene.symbol, organism="human")
#> array([ True,  True, False, False])
- classmethod from_values(values, field=None, create=False, organism=None, source=None, mute=False)¶
Bulk create validated records by parsing values for an identifier such as a name or an id.
- Parameters:
values (list[str] | Series | array) – A list of values for an identifier, e.g. ["name1", "name2"].
field (str | DeferredAttribute | None, default: None) – A SQLRecord field to look up, e.g., bt.CellMarker.name.
create (bool, default: False) – Whether to create records if they don’t exist.
organism (SQLRecord | str | None, default: None) – A bionty.Organism name or record.
source (SQLRecord | None, default: None) – A bionty.Source record to validate against to create records for.
mute (bool, default: False) – Whether to mute logging.
- Return type:
- Returns:
A list of validated records. For bionty registries, also returns knowledge-coupled records.
Notes
For more info, see tutorial: Manage biological ontologies.
Example:
import bionty as bt

# Bulk create from non-validated values will log warnings & return an empty list
ulabels = ln.ULabel.from_values(["benchmark", "prediction", "test"])
assert len(ulabels) == 0

# Bulk create records from validated values returns the corresponding existing records
ulabels = ln.ULabel.from_values(["benchmark", "prediction", "test"], create=True).save()
assert len(ulabels) == 3

# Bulk create records from public reference
bt.CellType.from_values(["T cell", "B cell"]).save()
- classmethod standardize(values, field=None, *, return_field=None, return_mapper=False, case_sensitive=False, mute=False, source_aware=True, keep='first', synonyms_field='synonyms', organism=None, source=None, strict_source=False)¶
Maps input synonyms to standardized names.
- Parameters:
values (Iterable) – Identifiers that will be standardized.
field (str | DeferredAttribute | None, default: None) – The field representing the standardized names.
return_field (str | DeferredAttribute | None, default: None) – The field to return. Defaults to field.
return_mapper (bool, default: False) – If True, returns {input_value: standardized_name}.
case_sensitive (bool, default: False) – Whether the mapping is case sensitive.
mute (bool, default: False) – Whether to mute logging.
source_aware (bool, default: True) – Whether to standardize from public source. Defaults to True for BioRecord registries.
keep (Literal['first', 'last', False], default: 'first') – When a synonym maps to multiple names, determines which duplicates to mark as pd.DataFrame.duplicated: "first" returns the first mapped standardized name, "last" returns the last, and False returns all mapped standardized names. When keep is False, the returned list of standardized names will contain nested lists in case of duplicates. When a field is converted into return_field, keep marks which matches to keep when multiple return_field values map to the same field value.
synonyms_field (str, default: 'synonyms') – A field containing the concatenated synonyms.
organism (str | SQLRecord | None, default: None) – An Organism name or record.
source (SQLRecord | None, default: None) – A bionty.Source record that specifies the version to validate against.
strict_source (bool, default: False) – Determines the validation behavior against records in the registry. If False, validation will include all records in the registry, ignoring the specified source. If True, validation will only include records in the registry that are linked to the specified source. Note: this parameter won’t affect validation against public sources.
- Return type:
list[str] | dict[str, str]
- Returns:
If return_mapper is False – a list of standardized names. Otherwise, a dictionary of mapped values with mappable synonyms as keys and standardized names as values.
See also
add_synonym() – Add synonyms.
remove_synonym() – Remove synonyms.
Example:
import bionty as bt

# save some gene records
bt.Gene.from_values(["A1CF", "A1BG", "BRCA2"], field="symbol", organism="human").save()

# standardize gene synonyms
gene_synonyms = ["A1CF", "A1BG", "FANCD1", "FANCD20"]
bt.Gene.standardize(gene_synonyms)
#> ['A1CF', 'A1BG', 'BRCA2', 'FANCD20']
- add_synonym(synonym, force=False, save=None)¶
Add synonyms to a record.
- Parameters:
synonym (str | list[str] | Series | array) – The synonyms to add to the record.
force (bool, default: False) – Whether to add synonyms even if they are already synonyms of other records.
save (bool | None, default: None) – Whether to save the record to the database.
See also
remove_synonym() – Remove synonyms.
Example:
import bionty as bt

# save "T cell" record
record = bt.CellType.from_source(name="T cell").save()
record.synonyms
#> "T-cell|T lymphocyte|T-lymphocyte"

# add a synonym
record.add_synonym("T cells")
record.synonyms
#> "T cells|T-cell|T-lymphocyte|T lymphocyte"
- remove_synonym(synonym)¶
Remove synonyms from a record.
- Parameters:
synonym (str | list[str] | Series | array) – The synonym values to remove.
See also
add_synonym() – Add synonyms.
Example:
import bionty as bt

# save "T cell" record
record = bt.CellType.from_source(name="T cell").save()
record.synonyms
#> "T-cell|T lymphocyte|T-lymphocyte"

# remove a synonym
record.remove_synonym("T-cell")
record.synonyms
#> "T lymphocyte|T-lymphocyte"
- set_abbr(value)¶
Set value for abbr field and add to synonyms.
- Parameters:
value (str) – A value for an abbreviation.
See also
Example:
import bionty as bt

# save an experimental factor record
scrna = bt.ExperimentalFactor.from_source(name="single-cell RNA sequencing").save()
assert scrna.abbr is None
assert scrna.synonyms == "single-cell RNA-seq|single-cell transcriptome sequencing|scRNA-seq|single cell RNA sequencing"

# set abbreviation
scrna.set_abbr("scRNA")
assert scrna.abbr == "scRNA"

# synonyms are updated
assert scrna.synonyms == "scRNA|single-cell RNA-seq|single cell RNA sequencing|single-cell transcriptome sequencing|scRNA-seq"
- class lamindb.models.TracksRun¶
- class lamindb.models.TracksRun(*db_args)
Base class tracking latest run, creating user, and created_at timestamp.
- Meta = <class 'lamindb.models.run.TracksRun.Meta'>¶
- created_by_id¶
- property pk¶
- run_id¶
- refresh_from_db(using=None, fields=None, from_queryset=None)¶
Reload field values from the database.
By default, the reloading happens from the database this instance was loaded from, or by the read router if this instance wasn’t loaded from any database. The using parameter will override the default.
Fields can be used to specify which fields to reload. The fields should be an iterable of field attnames. If fields is None, then all non-deferred fields are reloaded.
When accessing deferred fields of an instance, the deferred loading of the field will call this method.
- async arefresh_from_db(using=None, fields=None, from_queryset=None)¶
- save(*args, force_insert=False, force_update=False, using=None, update_fields=None)¶
Save the current instance. Override this in a subclass if you want to control the saving process.
The ‘force_insert’ and ‘force_update’ parameters can be used to insist that the “save” must be an SQL insert or update (or equivalent for non-SQL backends), respectively. Normally, they should not be set.
- delete(using=None, keep_parents=False)¶
- class lamindb.models.TracksUpdates¶
- class lamindb.models.TracksUpdates(*db_args)
Base class tracking previous runs and updated_at timestamp.
- Meta = <class 'lamindb.models.run.TracksUpdates.Meta'>¶
- property pk¶
- refresh_from_db(using=None, fields=None, from_queryset=None)¶
Reload field values from the database.
By default, the reloading happens from the database this instance was loaded from, or by the read router if this instance wasn’t loaded from any database. The using parameter will override the default.
Fields can be used to specify which fields to reload. The fields should be an iterable of field attnames. If fields is None, then all non-deferred fields are reloaded.
When accessing deferred fields of an instance, the deferred loading of the field will call this method.
- async arefresh_from_db(using=None, fields=None, from_queryset=None)¶
- save(*args, force_insert=False, force_update=False, using=None, update_fields=None)¶
Save the current instance. Override this in a subclass if you want to control the saving process.
The ‘force_insert’ and ‘force_update’ parameters can be used to insist that the “save” must be an SQL insert or update (or equivalent for non-SQL backends), respectively. Normally, they should not be set.
- delete(using=None, keep_parents=False)¶
Query sets & managers¶
- class lamindb.models.BasicQuerySet(model=None, query=None, using=None, hints=None)¶
Sets of records returned by queries.
See also
Examples
Any filter statement produces a query set:
queryset = Registry.filter(name__startswith="keyword")
- property db¶
Return the database used if this query is executed now.
- property ordered¶
Return True if the QuerySet is ordered – i.e. has an order_by() clause or a default ordering on the model (or is empty).
- property query¶
- classmethod as_manager()¶
- to_dataframe(*, include=None, features=None, limit=100, order_by='-id')¶
Evaluate and convert to pd.DataFrame.
By default, maps simple fields and foreign keys onto DataFrame columns.
Guide: Query & search registries
- Parameters:
include (str | list[str] | None, default: None) – Related data to include as columns. Takes strings of form "records__name", "cell_types__name", etc. or a list of such strings. For Artifact, Record, and Run, can also pass "features" to include features with data types pointing to entities in the core schema. If "privates", includes private fields (fields starting with _).
features (str | list[str] | None, default: None) – Configure the features to include. Can be a feature name or a list of such names. If "queryset", infers the features used within the current queryset. Only available for Artifact, Record, and Run.
limit (int | None, default: 100) – Maximum number of rows to display. If None, includes all results.
order_by (str | None, default: '-id') – Field name to order the records by. Prefix with ‘-’ for descending order. Defaults to ‘-id’ to get the most recent records. This argument is ignored if the queryset is already ordered or if the specified field does not exist.
- Return type:
DataFrame
Examples
Include the name of the creator:
ln.Record.to_dataframe(include="created_by__name")
Include features:
ln.Artifact.to_dataframe(include="features")
Include selected features:
ln.Artifact.to_dataframe(features=["cell_type_by_expert", "cell_type_by_model"])
- delete(*args, permanent=None, **kwargs)¶
Delete all records in the query set.
- Parameters:
permanent (bool | None, default: None) – Whether to permanently delete the record (skips trash). Is only relevant for records that have the branch field. If None, uses soft delete for records that have the branch field, hard delete otherwise.
Note
Calling delete() twice on the same queryset does NOT permanently delete in bulk operations. Use permanent=True for actual deletion.
Examples
For any QuerySet object qs, call:
>>> qs.delete()
- to_list(field=None)¶
Populate an (unordered) list with the results.
Note that the order in this list is only meaningful if you ordered the underlying query set with .order_by().
- Return type:
list[SQLRecord] | list[str]
Examples
>>> queryset.to_list()  # list of records
>>> queryset.to_list("name")  # list of values
- first()¶
If non-empty, the first result in the query set, otherwise None.
- Return type:
SQLRecord | None
Examples
>>> queryset.first()
- one_or_none()¶
At most one result. Returns it if there is one, otherwise returns None.
- Return type:
SQLRecord | None
Examples
>>> ULabel.filter(name="benchmark").one_or_none()
>>> ULabel.filter(name="non existing label").one_or_none()
- search(string, **kwargs)¶
Search.
- Parameters:
string (str) – The input string to match against the field ontology values.
field – The field or fields to search. Searches all string fields by default.
limit – Maximum number of top results to return.
case_sensitive – Whether the match is case sensitive.
- Returns:
A sorted DataFrame of search results with a score in column score, or a QuerySet if return_queryset is True.
Examples
records = ln.Record.from_values(["Label1", "Label2", "Label3"], field="name").save()
ln.Record.search("Label2")
- lookup(field=None, **kwargs)¶
Return an auto-complete object for a field.
- Parameters:
field (str | DeferredAttribute | None, default: None) – The field to look up the values for. Defaults to the first string field.
return_field – The field to return. If None, returns the whole record.
keep – When multiple records are found for a lookup, how to return the records. "first": return the first record. "last": return the last record. False: return all records.
- Return type:
NamedTuple
- Returns:
A NamedTuple of lookup information of the field values with a dictionary converter.
See also
Examples
Lookup via auto-complete on the returned object:
import bionty as bt
bt.Gene.from_source(symbol="ADGB-DT").save()
lookup = bt.Gene.lookup()
lookup.adgb_dt
Look up via auto-complete in dictionary:
lookup_dict = lookup.dict()
lookup_dict['ADGB-DT']
Look up via a specific field:
lookup_by_ensembl_id = bt.Gene.lookup(field="ensembl_gene_id")
lookup_by_ensembl_id.ensg00000002745
Return a specific field value instead of the full record:
lookup_return_symbols = bt.Gene.lookup(field="ensembl_gene_id", return_field="symbol")
- validate(values, field=None, **kwargs)¶
Validate values against existing values of a string field.
Note that this is strict validation, which only asserts exact matches.
- Parameters:
values (list[str] | Series | array) – Values that will be validated against the field.
field (str | DeferredAttribute | None, default: None) – The field of values. Examples are 'ontology_id' to map against the source ID or 'name' to map against the ontologies field names.
mute – Whether to mute logging.
organism – An Organism name or record.
source – A bionty.Source record that specifies the version to validate against.
strict_source – Determines the validation behavior against records in the registry. If False, validation will include all records in the registry, ignoring the specified source. If True, validation will only include records in the registry that are linked to the specified source. Note: this parameter won’t affect validation against public sources.
- Returns:
A vector of booleans indicating if an element is validated.
See also
Example:
import bionty as bt

bt.Gene.from_values(["A1CF", "A1BG", "BRCA2"], field="symbol", organism="human").save()
gene_symbols = ["A1CF", "A1BG", "FANCD1", "FANCD20"]
bt.Gene.validate(gene_symbols, field=bt.Gene.symbol, organism="human")
#> array([ True,  True, False, False])
- inspect(values, field=None, **kwargs)¶
Inspect if values are mappable to a field.
Being mappable means that an exact match exists.
- Parameters:
values (list[str] | Series | array) – Values that will be checked against the field.
field (str | DeferredAttribute | None, default: None) – The field of values. Examples are 'ontology_id' to map against the source ID or 'name' to map against the ontologies field names.
mute – Whether to mute logging.
organism – An Organism name or record.
source – A bionty.Source record that specifies the version to inspect against.
strict_source – Determines the validation behavior against records in the registry. If False, validation will include all records in the registry, ignoring the specified source. If True, validation will only include records in the registry that are linked to the specified source. Note: this parameter won’t affect validation against public sources.
See also
Example:
import bionty as bt

# save some gene records
bt.Gene.from_values(["A1CF", "A1BG", "BRCA2"], field="symbol", organism="human").save()

# inspect gene symbols
gene_symbols = ["A1CF", "A1BG", "FANCD1", "FANCD20"]
result = bt.Gene.inspect(gene_symbols, field=bt.Gene.symbol, organism="human")
assert result.validated == ["A1CF", "A1BG"]
assert result.non_validated == ["FANCD1", "FANCD20"]
- standardize(values, field=None, **kwargs)¶
Maps input synonyms to standardized names.
- Parameters:
values (Iterable) – Identifiers that will be standardized.
field (str | DeferredAttribute | None, default: None) – The field representing the standardized names.
return_field – The field to return. Defaults to field.
return_mapper – If True, returns {input_value: standardized_name}.
case_sensitive – Whether the mapping is case sensitive.
mute – Whether to mute logging.
source_aware – Whether to standardize from public source. Defaults to True for BioRecord registries.
keep – When a synonym maps to multiple names, determines which duplicates to mark as pd.DataFrame.duplicated: "first" returns the first mapped standardized name, "last" returns the last, and False returns all mapped standardized names. When keep is False, the returned list of standardized names will contain nested lists in case of duplicates. When a field is converted into return_field, keep marks which matches to keep when multiple return_field values map to the same field value.
synonyms_field – A field containing the concatenated synonyms.
organism – An Organism name or record.
source – A bionty.Source record that specifies the version to validate against.
strict_source – Determines the validation behavior against records in the registry. If False, validation will include all records in the registry, ignoring the specified source. If True, validation will only include records in the registry that are linked to the specified source. Note: this parameter won’t affect validation against public sources.
- Returns:
If return_mapper is False – a list of standardized names. Otherwise, a dictionary of mapped values with mappable synonyms as keys and standardized names as values.
See also
add_synonym() – Add synonyms.
remove_synonym() – Remove synonyms.
Example:
import bionty as bt

# save some gene records
bt.Gene.from_values(["A1CF", "A1BG", "BRCA2"], field="symbol", organism="human").save()

# standardize gene synonyms
gene_synonyms = ["A1CF", "A1BG", "FANCD1", "FANCD20"]
bt.Gene.standardize(gene_synonyms)
#> ['A1CF', 'A1BG', 'BRCA2', 'FANCD20']
- iterator(chunk_size=None)¶
An iterator over the results from applying this QuerySet to the database. chunk_size must be provided for QuerySets that prefetch related objects. Otherwise, a default chunk_size of 2000 is supplied.
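For instance, to stream a large result set without loading all records into memory at once (a generic Django-style sketch; the filter value is hypothetical):
for artifact in ln.Artifact.filter(suffix=".parquet").iterator(chunk_size=500):
    print(artifact.key)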
- async aiterator(chunk_size=2000)¶
An asynchronous iterator over the results from applying this QuerySet to the database.
- aggregate(*args, **kwargs)¶
Return a dictionary containing the calculations (aggregation) over the current queryset.
If args is present the expression is passed as a kwarg using the Aggregate object’s default alias.
- async aaggregate(*args, **kwargs)¶
- count()¶
Perform a SELECT COUNT() and return the number of records as an integer.
If the QuerySet is already fully cached, return the length of the cached results set to avoid multiple SELECT COUNT(*) calls.
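For instance (the filter value is hypothetical):
n_parquet = ln.Artifact.filter(suffix=".parquet").count()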
- async acount()¶
- get(*args, **kwargs)¶
Perform the query and return a single object matching the given keyword arguments.
- async aget(*args, **kwargs)¶
- create(**kwargs)¶
Create a new object with the given kwargs, saving it to the database and returning the created object.
- async acreate(**kwargs)¶
- bulk_create(objs, batch_size=None, ignore_conflicts=False, update_conflicts=False, update_fields=None, unique_fields=None)¶
Insert each of the instances into the database. Do not call save() on each of the instances, do not send any pre/post_save signals, and do not set the primary key attribute if it is an autoincrement field (except if features.can_return_rows_from_bulk_insert=True). Multi-table models are not supported.
- async abulk_create(objs, batch_size=None, ignore_conflicts=False, update_conflicts=False, update_fields=None, unique_fields=None)¶
- bulk_update(objs, fields, batch_size=None)¶
Update the given fields in each of the given objects in the database.
- async abulk_update(objs, fields, batch_size=None)¶
- get_or_create(defaults=None, **kwargs)¶
Look up an object with the given kwargs, creating one if necessary. Return a tuple of (object, created), where created is a boolean specifying whether an object was created.
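A generic Django-style sketch (the label name is hypothetical):
label, created = ln.ULabel.filter().get_or_create(name="benchmark")
if created:
    print("created a new label")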
- async aget_or_create(defaults=None, **kwargs)¶
- update_or_create(defaults=None, create_defaults=None, **kwargs)¶
Look up an object with the given kwargs, updating one with defaults if it exists, otherwise create a new one. Optionally, an object can be created with different values than defaults by using create_defaults. Return a tuple (object, created), where created is a boolean specifying whether an object was created.
- async aupdate_or_create(defaults=None, create_defaults=None, **kwargs)¶
- earliest(*fields)¶
- async aearliest(*fields)¶
- latest(*fields)¶
Return the latest object according to fields (if given) or by the model’s Meta.get_latest_by.
- async alatest(*fields)¶
- async afirst()¶
- last()¶
Return the last object of a query or None if no match is found.
- async alast()¶
- in_bulk(id_list=None, *, field_name='pk')¶
Return a dictionary mapping each of the given IDs to the object with that ID. If id_list isn’t provided, evaluate the entire QuerySet.
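For instance, to fetch several records by primary key or by a unique field in one query (a generic Django-style sketch; the ids and uid are hypothetical):
by_id = ln.ULabel.filter().in_bulk([1, 2, 3])  # {1: <ULabel>, 2: <ULabel>, 3: <ULabel>}
by_uid = ln.ULabel.filter().in_bulk(["FvtpPJLJ"], field_name="uid")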
- async ain_bulk(id_list=None, *, field_name='pk')¶
- update(**kwargs)¶
Update all elements in the current QuerySet, setting all the given fields to the appropriate values.
- async aupdate(**kwargs)¶
- exists()¶
Return True if the QuerySet would have any results, False otherwise.
- async aexists()¶
- contains(obj)¶
Return True if the QuerySet contains the provided obj, False otherwise.
- async acontains(obj)¶
- explain(*, format=None, **options)¶
Runs an EXPLAIN on the SQL query this QuerySet would perform, and returns the results.
- async aexplain(*, format=None, **options)¶
- raw(raw_query, params=(), translations=None, using=None)¶
- values(*fields, **expressions)¶
- values_list(*fields, flat=False, named=False)¶
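For instance, to pull plain values instead of full records (a generic Django-style sketch):
names = ln.ULabel.filter().values_list("name", flat=True)  # flat list of names
rows = ln.ULabel.filter().values("uid", "name")  # dict-like rows with just these fields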
- dates(field_name, kind, order='ASC')¶
Return a list of date objects representing all available dates for the given field_name, scoped to ‘kind’.
- datetimes(field_name, kind, order='ASC', tzinfo=None)¶
Return a list of datetime objects representing all available datetimes for the given field_name, scoped to ‘kind’.
- none()¶
Return an empty QuerySet.
- all()¶
Return a new QuerySet that is a copy of the current one. This allows a QuerySet to proxy for a model manager in some cases.
- filter(*args, **kwargs)¶
Return a new QuerySet instance with the args ANDed to the existing set.
- exclude(*args, **kwargs)¶
Return a new QuerySet instance with NOT (args) ANDed to the existing set.
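For instance, filter() and exclude() can be chained to narrow a queryset step by step (a generic sketch; the field values are hypothetical):
qs = ln.Artifact.filter(suffix=".h5ad").exclude(key__startswith="test/")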
- complex_filter(filter_obj)¶
Return a new QuerySet instance with filter_obj added to the filters.
filter_obj can be a Q object or a dictionary of keyword lookup arguments.
This exists to support framework features such as ‘limit_choices_to’, and usually it will be more natural to use other methods.
- union(*other_qs, all=False)¶
- intersection(*other_qs)¶
- difference(*other_qs)¶
- select_for_update(nowait=False, skip_locked=False, of=(), no_key=False)¶
Return a new QuerySet instance that will select objects with a FOR UPDATE lock.
- select_related(*fields)¶
Return a new QuerySet instance that will select related objects.
If fields are specified, they must be ForeignKey fields and only those related objects are included in the selection.
If select_related(None) is called, clear the list.
- prefetch_related(*lookups)¶
Return a new QuerySet instance that will prefetch the specified Many-To-One and Many-To-Many related objects when the QuerySet is evaluated.
When prefetch_related() is called more than once, append to the list of prefetch lookups. If prefetch_related(None) is called, clear the list.
- annotate(*args, **kwargs)¶
Return a query set in which the returned objects have been annotated with extra data or aggregations.
- alias(*args, **kwargs)¶
Return a query set with added aliases for extra data or aggregations.
- order_by(*field_names)¶
Return a new QuerySet instance with the ordering changed.
- distinct(*field_names)¶
Return a new QuerySet instance that will select only distinct results.
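For instance (a generic Django-style sketch):
latest_first = ln.Artifact.filter().order_by("-created_at")
unique_suffixes = ln.Artifact.filter().values_list("suffix", flat=True).distinct()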
- extra(select=None, where=None, params=None, tables=None, order_by=None, select_params=None)¶
Add extra SQL fragments to the query.
- reverse()¶
Reverse the ordering of the QuerySet.
- defer(*fields)¶
Defer the loading of data for certain fields until they are accessed. Add the set of deferred fields to any existing set of deferred fields. The only exception to this is if None is passed in as the only parameter, in which case remove all deferrals.
- only(*fields)¶
Essentially, the opposite of defer(). Only the fields passed into this method and that are not already specified as deferred are loaded immediately when the queryset is evaluated.
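For instance, to load only a few columns eagerly and defer the rest (a generic Django-style sketch):
slim = ln.Artifact.filter().only("uid", "key")  # load just these fields up front
deferred = ln.Artifact.filter().defer("description")  # load description lazily on access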
- using(alias)¶
Select which database this QuerySet should execute against.
- resolve_expression(*args, **kwargs)¶
- class lamindb.models.QuerySet(model=None, query=None, using=None, hints=None)¶
Sets of records returned by queries.
Implements additional filtering capabilities.
See also
Examples
>>> ULabel(name="my label").save()
>>> queryset = ULabel.filter(name="my label")
>>> queryset  # an instance of QuerySet
- property db¶
Return the database used if this query is executed now.
- property ordered¶
Return True if the QuerySet is ordered – i.e. has an order_by() clause or a default ordering on the model (or is empty).
- property query¶
- classmethod as_manager()¶
- get(idlike=None, **expressions)¶
Query a single record. Raises error if there are more or none.
- Return type:
- to_dataframe(*, include=None, features=None, limit=100, order_by='-id')¶
Evaluate and convert to pd.DataFrame.
By default, maps simple fields and foreign keys onto DataFrame columns.
Guide: Query & search registries
- Parameters:
include (str | list[str] | None, default: None) – Related data to include as columns. Takes strings of form "records__name", "cell_types__name", etc. or a list of such strings. For Artifact, Record, and Run, can also pass "features" to include features with data types pointing to entities in the core schema. If "privates", includes private fields (fields starting with _).
features (str | list[str] | None, default: None) – Configure the features to include. Can be a feature name or a list of such names. If "queryset", infers the features used within the current queryset. Only available for Artifact, Record, and Run.
limit (int | None, default: 100) – Maximum number of rows to display. If None, includes all results.
order_by (str | None, default: '-id') – Field name to order the records by. Prefix with ‘-’ for descending order. Defaults to ‘-id’ to get the most recent records. This argument is ignored if the queryset is already ordered or if the specified field does not exist.
- Return type:
DataFrame
Examples
Include the name of the creator:
ln.Record.to_dataframe(include="created_by__name")
Include features:
ln.Artifact.to_dataframe(include="features")
Include selected features:
ln.Artifact.to_dataframe(features=["cell_type_by_expert", "cell_type_by_model"])
- delete(*args, permanent=None, **kwargs)¶
Delete all records in the query set.
- Parameters:
permanent (bool | None, default: None) – Whether to permanently delete the record (skips trash). Is only relevant for records that have the branch field. If None, uses soft delete for records that have the branch field, hard delete otherwise.
Note
Calling delete() twice on the same queryset does NOT permanently delete in bulk operations. Use permanent=True for actual deletion.
Examples
For any QuerySet object qs, call:
>>> qs.delete()
- to_list(field=None)¶
Populate an (unordered) list with the results.
Note that the order in this list is only meaningful if you ordered the underlying query set with .order_by().
- Return type:
list[SQLRecord] | list[str]
Examples
>>> queryset.to_list()  # list of records
>>> queryset.to_list("name")  # list of values
- first()¶
If non-empty, the first result in the query set, otherwise None.
- Return type:
SQLRecord | None
Examples
>>> queryset.first()
- one_or_none()¶
At most one result. Returns it if there is one, otherwise returns None.
- Return type:
SQLRecord | None
Examples
>>> ULabel.filter(name="benchmark").one_or_none()
>>> ULabel.filter(name="non existing label").one_or_none()
- search(string, **kwargs)¶
Search.
- Parameters:
string (str) – The input string to match against the field ontology values.
field – The field or fields to search. Searches all string fields by default.
limit – Maximum number of top results to return.
case_sensitive – Whether the match is case sensitive.
- Returns:
A sorted DataFrame of search results with a score in column score, or a QuerySet if return_queryset is True.
Examples
records = ln.Record.from_values(["Label1", "Label2", "Label3"], field="name").save()
ln.Record.search("Label2")
- lookup(field=None, **kwargs)¶
Return an auto-complete object for a field.
- Parameters:
field (
str|DeferredAttribute|None, default:None) – The field to look up the values for. Defaults to first string field.return_field – The field to return. If
None, returns the whole record.keep – When multiple records are found for a lookup, how to return the records. -
"first": return the first record. -"last": return the last record. -False: return all records.
- Return type:
NamedTuple- Returns:
A
NamedTupleof lookup information of the field values with a dictionary converter.
See also
Examples
Look up via auto-complete:
import bionty as bt
bt.Gene.from_source(symbol="ADGB-DT").save()
lookup = bt.Gene.lookup()
lookup.adgb_dt
Look up via auto-complete in dictionary:
lookup_dict = lookup.dict()
lookup_dict['ADGB-DT']
Look up via a specific field:
lookup_by_ensembl_id = bt.Gene.lookup(field="ensembl_gene_id")
lookup_by_ensembl_id.ensg00000002745
Return a specific field value instead of the full record:
lookup_return_symbols = bt.Gene.lookup(field="ensembl_gene_id", return_field="symbol")
- validate(values, field=None, **kwargs)¶
Validate values against existing values of a string field.
Note that this is strict validation: only exact matches are considered validated.
- Parameters:
values (
list[str] |Series|array) – Values that will be validated against the field.field (
str|DeferredAttribute|None, default:None) – The field of values. Examples are'ontology_id'to map against the source ID or'name'to map against the ontologies field names.mute – Whether to mute logging.
organism – An Organism name or record.
source – A
bionty.Sourcerecord that specifies the version to validate against.strict_source – Determines the validation behavior against records in the registry. - If
False, validation will include all records in the registry, ignoring the specified source. - IfTrue, validation will only include records in the registry that are linked to the specified source. Note: this parameter won’t affect validation against public sources.
- Returns:
A vector of booleans indicating if an element is validated.
See also
Example:
import bionty as bt
bt.Gene.from_values(["A1CF", "A1BG", "BRCA2"], field="symbol", organism="human").save()
gene_symbols = ["A1CF", "A1BG", "FANCD1", "FANCD20"]
bt.Gene.validate(gene_symbols, field=bt.Gene.symbol, organism="human")
#> array([ True,  True, False, False])
- inspect(values, field=None, **kwargs)¶
Inspect if values are mappable to a field.
Being mappable means that an exact match exists.
- Parameters:
values (
list[str] |Series|array) – Values that will be checked against the field.field (
str|DeferredAttribute|None, default:None) – The field of values. Examples are'ontology_id'to map against the source ID or'name'to map against the ontologies field names.mute – Whether to mute logging.
organism – An Organism name or record.
source – A
bionty.Sourcerecord that specifies the version to inspect against.strict_source – Determines the validation behavior against records in the registry. - If
False, validation will include all records in the registry, ignoring the specified source. - IfTrue, validation will only include records in the registry that are linked to the specified source. Note: this parameter won’t affect validation against public sources.
See also
Example:
import bionty as bt
# save some gene records
bt.Gene.from_values(["A1CF", "A1BG", "BRCA2"], field="symbol", organism="human").save()
# inspect gene symbols
gene_symbols = ["A1CF", "A1BG", "FANCD1", "FANCD20"]
result = bt.Gene.inspect(gene_symbols, field=bt.Gene.symbol, organism="human")
assert result.validated == ["A1CF", "A1BG"]
assert result.non_validated == ["FANCD1", "FANCD20"]
- standardize(values, field=None, **kwargs)¶
Maps input synonyms to standardized names.
- Parameters:
values (
Iterable) – Identifiers that will be standardized.field (
str|DeferredAttribute|None, default:None) – The field representing the standardized names.return_field – The field to return. Defaults to field.
return_mapper – If
True, returns{input_value: standardized_name}.case_sensitive – Whether the mapping is case sensitive.
mute – Whether to mute logging.
source_aware – Whether to standardize from public source. Defaults to
Truefor BioRecord registries.keep –
When a synonym maps to multiple names, determines which duplicates to mark as
pd.DataFrame.duplicated: - "first": returns the first mapped standardized name - "last": returns the last mapped standardized name - False: returns all mapped standardized names. When
keepisFalse, the returned list of standardized names will contain nested lists in case of duplicates.When a field is converted into return_field, keep marks which matches to keep when multiple return_field values map to the same field value.
synonyms_field – A field containing the concatenated synonyms.
organism – An Organism name or record.
source – A
bionty.Sourcerecord that specifies the version to validate against.strict_source – Determines the validation behavior against records in the registry. - If
False, validation will include all records in the registry, ignoring the specified source. - IfTrue, validation will only include records in the registry that are linked to the specified source. Note: this parameter won’t affect validation against public sources.
- Returns:
If
return_mapperisFalse– a list of standardized names. Otherwise, a dictionary of mapped values with mappable synonyms as keys and standardized names as values.
See also
add_synonym()Add synonyms.
remove_synonym()Remove synonyms.
Example:
import bionty as bt
# save some gene records
bt.Gene.from_values(["A1CF", "A1BG", "BRCA2"], field="symbol", organism="human").save()
# standardize gene synonyms
gene_synonyms = ["A1CF", "A1BG", "FANCD1", "FANCD20"]
bt.Gene.standardize(gene_synonyms)
#> ['A1CF', 'A1BG', 'BRCA2', 'FANCD20']
- iterator(chunk_size=None)¶
An iterator over the results from applying this QuerySet to the database. chunk_size must be provided for QuerySets that prefetch related objects. Otherwise, a default chunk_size of 2000 is supplied.
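For large result sets, iterating in chunks avoids caching every row at once; a minimal sketch (the suffix filter and chunk size are arbitrary):
>>> for artifact in ln.Artifact.filter(suffix=".h5ad").iterator(chunk_size=500):
...     print(artifact.key)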
- async aiterator(chunk_size=2000)¶
An asynchronous iterator over the results from applying this QuerySet to the database.
- aggregate(*args, **kwargs)¶
Return a dictionary containing the calculations (aggregation) over the current queryset.
If args is present the expression is passed as a kwarg using the Aggregate object’s default alias.
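Aggregations accept standard Django aggregate expressions; a hedged sketch that summarizes the size field of matching artifacts (assuming size is populated):
>>> from django.db.models import Avg, Sum
>>> ln.Artifact.filter(suffix=".h5ad").aggregate(total_bytes=Sum("size"), mean_bytes=Avg("size"))
#> {'total_bytes': ..., 'mean_bytes': ...}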
- async aaggregate(*args, **kwargs)¶
- count()¶
Perform a SELECT COUNT() and return the number of records as an integer.
If the QuerySet is already fully cached, return the length of the cached results set to avoid multiple SELECT COUNT(*) calls.
- async acount()¶
- async aget(*args, **kwargs)¶
- create(**kwargs)¶
Create a new object with the given kwargs, saving it to the database and returning the created object.
- async acreate(**kwargs)¶
- bulk_create(objs, batch_size=None, ignore_conflicts=False, update_conflicts=False, update_fields=None, unique_fields=None)¶
Insert each of the instances into the database. Do not call save() on each of the instances, do not send any pre/post_save signals, and do not set the primary key attribute if it is an autoincrement field (except if features.can_return_rows_from_bulk_insert=True). Multi-table models are not supported.
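Because bulk_create bypasses save() and signals, registry-specific save logic does not run; treat it as a low-level escape hatch. A hedged sketch via the plain Django manager (assumed to be available as .objects; the project names are hypothetical):
>>> projects = [ln.Project(name=f"benchmark-{i}") for i in range(3)]
>>> ln.Project.objects.bulk_create(projects, batch_size=100)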
- async abulk_create(objs, batch_size=None, ignore_conflicts=False, update_conflicts=False, update_fields=None, unique_fields=None)¶
- bulk_update(objs, fields, batch_size=None)¶
Update the given fields in each of the given objects in the database.
- async abulk_update(objs, fields, batch_size=None)¶
- get_or_create(defaults=None, **kwargs)¶
Look up an object with the given kwargs, creating one if necessary. Return a tuple of (object, created), where created is a boolean specifying whether an object was created.
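A minimal sketch of idempotent creation via the plain Django manager (assumed available as .objects; the project name is hypothetical):
>>> project, created = ln.Project.objects.get_or_create(name="ML benchmarks")
>>> created  # True only if a new record was inserted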
- async aget_or_create(defaults=None, **kwargs)¶
- update_or_create(defaults=None, create_defaults=None, **kwargs)¶
Look up an object with the given kwargs, updating one with defaults if it exists, otherwise create a new one. Optionally, an object can be created with different values than defaults by using create_defaults. Return a tuple (object, created), where created is a boolean specifying whether an object was created.
- async aupdate_or_create(defaults=None, create_defaults=None, **kwargs)¶
- earliest(*fields)¶
- async aearliest(*fields)¶
- latest(*fields)¶
Return the latest object according to fields (if given) or by the model’s Meta.get_latest_by.
- async alatest(*fields)¶
- async afirst()¶
- last()¶
Return the last object of a query or None if no match is found.
- async alast()¶
- in_bulk(id_list=None, *, field_name='pk')¶
Return a dictionary mapping each of the given IDs to the object with that ID. If
id_listisn’t provided, evaluate the entire QuerySet.
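A hedged sketch of fetching several records keyed by uid (field_name must reference a unique field; the uids below are hypothetical):
>>> ln.Record.filter().in_bulk(["FvtpPJLJ", "aBcDeFgH"], field_name="uid")
#> {'FvtpPJLJ': Record(...), 'aBcDeFgH': Record(...)}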
- async ain_bulk(id_list=None, *, field_name='pk')¶
- update(**kwargs)¶
Update all elements in the current QuerySet, setting all the given fields to the appropriate values.
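A minimal sketch of a bulk field update; this issues a single SQL UPDATE and does not call save() on each record (the key filter and description are hypothetical):
>>> ln.Artifact.filter(key__startswith="tmp/").update(description="temporary upload")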
- async aupdate(**kwargs)¶
- exists()¶
Return True if the QuerySet would have any results, False otherwise.
- async aexists()¶
- contains(obj)¶
Return True if the QuerySet contains the provided obj, False otherwise.
- async acontains(obj)¶
- explain(*, format=None, **options)¶
Runs an EXPLAIN on the SQL query this QuerySet would perform, and returns the results.
- async aexplain(*, format=None, **options)¶
- raw(raw_query, params=(), translations=None, using=None)¶
- values(*fields, **expressions)¶
- values_list(*fields, flat=False, named=False)¶
- dates(field_name, kind, order='ASC')¶
Return a list of date objects representing all available dates for the given field_name, scoped to ‘kind’.
- datetimes(field_name, kind, order='ASC', tzinfo=None)¶
Return a list of datetime objects representing all available datetimes for the given field_name, scoped to ‘kind’.
- none()¶
Return an empty QuerySet.
- all()¶
Return a new QuerySet that is a copy of the current one. This allows a QuerySet to proxy for a model manager in some cases.
- exclude(*args, **kwargs)¶
Return a new QuerySet instance with NOT (args) ANDed to the existing set.
- complex_filter(filter_obj)¶
Return a new QuerySet instance with filter_obj added to the filters.
filter_obj can be a Q object or a dictionary of keyword lookup arguments.
This exists to support framework features such as ‘limit_choices_to’, and usually it will be more natural to use other methods.
- union(*other_qs, all=False)¶
- intersection(*other_qs)¶
- difference(*other_qs)¶
- select_for_update(nowait=False, skip_locked=False, of=(), no_key=False)¶
Return a new QuerySet instance that will select objects with a FOR UPDATE lock.
- select_related(*fields)¶
Return a new QuerySet instance that will select related objects.
If fields are specified, they must be ForeignKey fields and only those related objects are included in the selection.
If select_related(None) is called, clear the list.
- prefetch_related(*lookups)¶
Return a new QuerySet instance that will prefetch the specified Many-To-One and Many-To-Many related objects when the QuerySet is evaluated.
When prefetch_related() is called more than once, append to the list of prefetch lookups. If prefetch_related(None) is called, clear the list.
- annotate(*args, **kwargs)¶
Return a query set in which the returned objects have been annotated with extra data or aggregations.
- alias(*args, **kwargs)¶
Return a query set with added aliases for extra data or aggregations.
- order_by(*field_names)¶
Return a new QuerySet instance with the ordering changed.
- distinct(*field_names)¶
Return a new QuerySet instance that will select only distinct results.
- extra(select=None, where=None, params=None, tables=None, order_by=None, select_params=None)¶
Add extra SQL fragments to the query.
- reverse()¶
Reverse the ordering of the QuerySet.
- defer(*fields)¶
Defer the loading of data for certain fields until they are accessed. Add the set of deferred fields to any existing set of deferred fields. The only exception to this is if None is passed in as the only parameter, in which case remove all deferrals.
- only(*fields)¶
Essentially, the opposite of defer(). Only the fields passed into this method and that are not already specified as deferred are loaded immediately when the queryset is evaluated.
- using(alias)¶
Select which database this QuerySet should execute against.
- resolve_expression(*args, **kwargs)¶
- class lamindb.models.QueryDB(instance)¶
Convenient access to QuerySets for every entity in a LaminDB instance.
- Parameters:
instance (
str) – Instance identifier in format “account/instance” or full instance string.
Examples
Query records from a remote instance:
cellxgene = ln.QueryDB("laminlabs/cellxgene")
artifacts = cellxgene.artifacts.filter(suffix=".h5ad")
records = cellxgene.records.filter(name__startswith="cell")
- class lamindb.models.ArtifactSet¶
Abstract class representing sets of artifacts returned by queries.
This class automatically extends
BasicQuerySet and QuerySet when the base model is Artifact.
Examples
>>> artifacts = ln.Artifact.filter(otype="AnnData")
>>> artifacts  # an instance of ArtifactQuerySet inheriting from ArtifactSet
- load(join='outer', is_run_input=None, **kwargs)¶
Cache and load to memory.
Returns an in-memory concatenated
DataFrameorAnnDataobject.- Return type:
DataFrame|AnnData
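A hedged sketch of concatenating a query set of dataframe-like artifacts into memory (assuming the matching artifacts were saved from DataFrames):
>>> artifacts = ln.Artifact.filter(otype="DataFrame")
>>> df = artifacts.load(join="outer")  # outer-joins columns across artifacts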
- open(engine='pyarrow', is_run_input=None, **kwargs)¶
Open a dataset for streaming.
Works for
pyarrowandpolarscompatible formats (.parquet,.csv,.ipcetc. files or directories with such files).- Parameters:
engine (
Literal['pyarrow','polars'], default:'pyarrow') – Which module to use for lazy loading of a dataframe frompyarroworpolarscompatible formats.is_run_input (
bool|None, default:None) – Whether to track this artifact as run input.**kwargs – Keyword arguments for
pyarrow.dataset.datasetorpolars.scan_*functions.
- Return type:
Dataset|Iterator[LazyFrame]
Notes
For more info, see guide: Slice & stream arrays.
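A hedged sketch of streaming parquet artifacts without loading them into memory (pyarrow is the default engine):
>>> artifacts = ln.Artifact.filter(suffix=".parquet")
>>> dataset = artifacts.open(engine="pyarrow")  # a pyarrow.dataset.Dataset
>>> dataset.head(5)  # materialize only the first rows as a pyarrow Table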
- mapped(layers_keys=None, obs_keys=None, obsm_keys=None, obs_filter=None, join='inner', encode_labels=True, unknown_label=None, cache_categories=True, parallel=False, dtype=None, stream=False, is_run_input=None)¶
Return a map-style dataset.
Returns a pytorch map-style dataset by virtually concatenating
AnnDataarrays.By default (
stream=False) AnnData arrays are moved into a local cache first.
__getitem__ of the MappedCollection object takes a single integer index and returns a dictionary with the observation data sample for this index from the AnnData objects in the collection. The dictionary has keys for layers_keys (.X is in "X"), obs_keys, obsm_keys (under f"obsm_{key}") and also "_store_idx" for the index of the AnnData object containing this observation sample.
Note
For a guide, see Train a machine learning model on a collection.
This method currently only works for collections or query sets of
AnnDataartifacts.- Parameters:
layers_keys (
str|list[str] |None, default:None) – Keys from the.layersslot.layers_keys=Noneor"X"in the list retrieves.X.obs_keys (
str|list[str] |None, default:None) – Keys from the.obsslots.obsm_keys (
str|list[str] |None, default:None) – Keys from the.obsmslots.obs_filter (
dict[str,str|list[str]] |None, default:None) – Select only observations with these values for the given obs columns. Should be a dictionary with obs column names as keys and filtering values (a string or a list of strings) as values.join (
Literal['inner','outer'] |None, default:'inner') –"inner"or"outer"virtual joins. IfNoneis passed, does not join.encode_labels (
bool|list[str], default:True) – Encode labels into integers. Can be a list with elements fromobs_keys.unknown_label (
str|dict[str,str] |None, default:None) – Encode this label to -1. Can be a dictionary with keys fromobs_keysifencode_labels=Trueor fromencode_labelsif it is a list.cache_categories (
bool, default:True) – Enable caching categories ofobs_keysfor faster access.parallel (
bool, default:False) – Enable sampling with multiple processes.dtype (
str|None, default:None) – Convert numpy arrays from .X, .layers and .obsm to this dtype.stream (
bool, default:False) – Whether to stream data from the array backend.is_run_input (
bool|None, default:None) – Whether to track this collection as run input.
- Return type:
MappedCollection
Examples
>>> import lamindb as ln
>>> from torch.utils.data import DataLoader
>>> collection = ln.Collection.get(description="my collection")
>>> mapped = collection.mapped(obs_keys=["cell_type", "batch"])
>>> dl = DataLoader(mapped, batch_size=128, shuffle=True)
>>> # also works for query sets of artifacts, '...' represents some filtering condition
>>> # additional filtering on artifacts of the collection
>>> mapped = collection.artifacts.all().filter(...).order_by("-created_at").mapped()
>>> # or directly from a query set of artifacts
>>> mapped = ln.Artifact.filter(..., otype="AnnData").order_by("-created_at").mapped()
- class lamindb.models.QueryManager(*args, **kwargs)¶
Manage queries through fields.
See also
Examples
Populate the
.parents ManyToMany relationship (a QueryManager):
ln.Record.from_values(["Label1", "Label2", "Label3"], field="name").save()
labels = ln.Record.filter(name__icontains="label")
label1 = ln.Record.get(name="Label1")
label1.parents.set(labels)
Convert all linked parents to a
DataFrame:label1.parents.to_dataframe()
- auto_created = False¶
- creation_counter = 43¶
- property db¶
- use_in_migrations = False¶
If set to True the manager will be serialized into migrations and will thus be available in e.g. RunPython operations.
- classmethod from_queryset(queryset_class, class_name=None)¶
- track_run_input_manager()¶
- to_list(field=None)¶
Populate a list.
- to_dataframe(**kwargs)¶
Convert to DataFrame.
For
**kwargs, seelamindb.models.QuerySet.to_dataframe().
- all()¶
Return a QuerySet of all records.
- search(string, **kwargs)¶
Search.
- Parameters:
string (
str) – The input string to match against the field ontology values.field – The field or fields to search. Search all string fields by default.
limit – Maximum amount of top results to return.
case_sensitive – Whether the match is case sensitive.
- Returns:
A sorted
DataFrame of search results with a score in column score, or a QuerySet if return_queryset is True.
Examples
records = ln.Record.from_values(["Label1", "Label2", "Label3"], field="name").save()
ln.Record.search("Label2")
- lookup(field=None, **kwargs)¶
Return an auto-complete object for a field.
- Parameters:
field (
str|DeferredAttribute|None, default:None) – The field to look up the values for. Defaults to first string field.return_field – The field to return. If
None, returns the whole record.keep – When multiple records are found for a lookup, how to return the records. -
"first": return the first record. -"last": return the last record. -False: return all records.
- Return type:
NamedTuple- Returns:
A
NamedTupleof lookup information of the field values with a dictionary converter.
See also
Examples
Look up via auto-complete:
import bionty as bt
bt.Gene.from_source(symbol="ADGB-DT").save()
lookup = bt.Gene.lookup()
lookup.adgb_dt
Look up via auto-complete in dictionary:
lookup_dict = lookup.dict()
lookup_dict['ADGB-DT']
Look up via a specific field:
lookup_by_ensembl_id = bt.Gene.lookup(field="ensembl_gene_id")
lookup_by_ensembl_id.ensg00000002745
Return a specific field value instead of the full record:
lookup_return_symbols = bt.Gene.lookup(field="ensembl_gene_id", return_field="symbol")
- get_queryset()¶
- aaggregate(*args, **kwargs)¶
- abulk_create(objs, batch_size=None, ignore_conflicts=False, update_conflicts=False, update_fields=None, unique_fields=None)¶
- abulk_update(objs, fields, batch_size=None)¶
- acontains(obj)¶
- acount()¶
- acreate(**kwargs)¶
- aearliest(*fields)¶
- aexists()¶
- aexplain(*, format=None, **options)¶
- afirst()¶
- aget(*args, **kwargs)¶
- aget_or_create(defaults=None, **kwargs)¶
- aggregate(*args, **kwargs)¶
Return a dictionary containing the calculations (aggregation) over the current queryset.
If args is present the expression is passed as a kwarg using the Aggregate object’s default alias.
- ain_bulk(id_list=None, *, field_name='pk')¶
- aiterator(chunk_size=2000)¶
An asynchronous iterator over the results from applying this QuerySet to the database.
- alast()¶
- alatest(*fields)¶
- alias(*args, **kwargs)¶
Return a query set with added aliases for extra data or aggregations.
- annotate(*args, **kwargs)¶
Return a query set in which the returned objects have been annotated with extra data or aggregations.
- aupdate(**kwargs)¶
- aupdate_or_create(defaults=None, create_defaults=None, **kwargs)¶
- bulk_create(objs, batch_size=None, ignore_conflicts=False, update_conflicts=False, update_fields=None, unique_fields=None)¶
Insert each of the instances into the database. Do not call save() on each of the instances, do not send any pre/post_save signals, and do not set the primary key attribute if it is an autoincrement field (except if features.can_return_rows_from_bulk_insert=True). Multi-table models are not supported.
- bulk_update(objs, fields, batch_size=None)¶
Update the given fields in each of the given objects in the database.
- complex_filter(filter_obj)¶
Return a new QuerySet instance with filter_obj added to the filters.
filter_obj can be a Q object or a dictionary of keyword lookup arguments.
This exists to support framework features such as ‘limit_choices_to’, and usually it will be more natural to use other methods.
- contains(obj)¶
Return True if the QuerySet contains the provided obj, False otherwise.
- count()¶
Perform a SELECT COUNT() and return the number of records as an integer.
If the QuerySet is already fully cached, return the length of the cached results set to avoid multiple SELECT COUNT(*) calls.
- create(**kwargs)¶
Create a new object with the given kwargs, saving it to the database and returning the created object.
- dates(field_name, kind, order='ASC')¶
Return a list of date objects representing all available dates for the given field_name, scoped to ‘kind’.
- datetimes(field_name, kind, order='ASC', tzinfo=None)¶
Return a list of datetime objects representing all available datetimes for the given field_name, scoped to ‘kind’.
- defer(*fields)¶
Defer the loading of data for certain fields until they are accessed. Add the set of deferred fields to any existing set of deferred fields. The only exception to this is if None is passed in as the only parameter, in which case remove all deferrals.
- difference(*other_qs)¶
- distinct(*field_names)¶
Return a new QuerySet instance that will select only distinct results.
- earliest(*fields)¶
- exclude(*args, **kwargs)¶
Return a new QuerySet instance with NOT (args) ANDed to the existing set.
- exists()¶
Return True if the QuerySet would have any results, False otherwise.
- explain(*, format=None, **options)¶
Runs an EXPLAIN on the SQL query this QuerySet would perform, and returns the results.
- extra(select=None, where=None, params=None, tables=None, order_by=None, select_params=None)¶
Add extra SQL fragments to the query.
- filter(*args, **kwargs)¶
Return a new QuerySet instance with the args ANDed to the existing set.
- first()¶
Return the first object of a query or None if no match is found.
- get(*args, **kwargs)¶
Perform the query and return a single object matching the given keyword arguments.
- get_or_create(defaults=None, **kwargs)¶
Look up an object with the given kwargs, creating one if necessary. Return a tuple of (object, created), where created is a boolean specifying whether an object was created.
- in_bulk(id_list=None, *, field_name='pk')¶
Return a dictionary mapping each of the given IDs to the object with that ID. If
id_listisn’t provided, evaluate the entire QuerySet.
- intersection(*other_qs)¶
- iterator(chunk_size=None)¶
An iterator over the results from applying this QuerySet to the database. chunk_size must be provided for QuerySets that prefetch related objects. Otherwise, a default chunk_size of 2000 is supplied.
- last()¶
Return the last object of a query or None if no match is found.
- latest(*fields)¶
Return the latest object according to fields (if given) or by the model’s Meta.get_latest_by.
- none()¶
Return an empty QuerySet.
- only(*fields)¶
Essentially, the opposite of defer(). Only the fields passed into this method and that are not already specified as deferred are loaded immediately when the queryset is evaluated.
- order_by(*field_names)¶
Return a new QuerySet instance with the ordering changed.
- prefetch_related(*lookups)¶
Return a new QuerySet instance that will prefetch the specified Many-To-One and Many-To-Many related objects when the QuerySet is evaluated.
When prefetch_related() is called more than once, append to the list of prefetch lookups. If prefetch_related(None) is called, clear the list.
- raw(raw_query, params=(), translations=None, using=None)¶
- reverse()¶
Reverse the ordering of the QuerySet.
- select_for_update(nowait=False, skip_locked=False, of=(), no_key=False)¶
Return a new QuerySet instance that will select objects with a FOR UPDATE lock.
- select_related(*fields)¶
Return a new QuerySet instance that will select related objects.
If fields are specified, they must be ForeignKey fields and only those related objects are included in the selection.
If select_related(None) is called, clear the list.
- union(*other_qs, all=False)¶
- update(**kwargs)¶
Update all elements in the current QuerySet, setting all the given fields to the appropriate values.
- update_or_create(defaults=None, create_defaults=None, **kwargs)¶
Look up an object with the given kwargs, updating one with defaults if it exists, otherwise create a new one. Optionally, an object can be created with different values than defaults by using create_defaults. Return a tuple (object, created), where created is a boolean specifying whether an object was created.
- using(alias)¶
Select which database this QuerySet should execute against.
- values(*fields, **expressions)¶
- values_list(*fields, flat=False, named=False)¶
- deconstruct()¶
Return a 5-tuple of the form (as_manager (True), manager_class, queryset_class, args, kwargs).
Raise a ValueError if the manager is dynamically generated.
- check(**kwargs)¶
- contribute_to_class(cls, name)¶
- db_manager(using=None, hints=None)¶
Storage of feature values¶
- class lamindb.models.FeatureValue(*args, **kwargs)¶
Non-categorical features values.
Categorical feature values are stored in their respective registries:
ULabel,CellType, etc.Unlike for ULabel, in
FeatureValue, values are grouped by features and not by an ontological hierarchy.
Simple fields¶
- value: Any¶
The JSON-like value.
- hash: str¶
Value hash.
- is_locked: bool¶
Whether the record is locked for edits.
- created_at: datetime¶
Time of creation of record.
Relational fields¶
Class methods¶
- classmethod get_or_create(feature, value)¶
- classmethod filter(*queries, **expressions)¶
Query records.
- Parameters:
queries – One or multiple
Qobjects.expressions – Fields and values passed as Django query expressions.
- Return type:
See also
Guide: Query & search registries
Django documentation: Queries
Examples
>>> ln.Project(name="my label").save()
>>> ln.Project.filter(name__startswith="my").to_dataframe()
- classmethod get(idlike=None, **expressions)¶
Get a single record.
- Parameters:
idlike (
int|str|None, default:None) – Either a uid stub, uid or an integer id.expressions – Fields and values passed as Django query expressions.
- Raises:
lamindb.errors.DoesNotExist – In case no matching record is found.
- Return type:
See also
Guide: Query & search registries
Django documentation: Queries
Examples
record = ln.Record.get("FvtpPJLJ")
record = ln.Record.get(name="my-label")
- classmethod to_dataframe(include=None, features=False, limit=100)¶
Evaluate and convert to
pd.DataFrame. By default, maps simple fields and foreign keys onto
DataFrame columns.
Guide: Query & search registries
- Parameters:
include (
str|list[str] |None, default:None) – Related data to include as columns. Takes strings of form"records__name","cell_types__name", etc. or a list of such strings. ForArtifact,Record, andRun, can also pass"features"to include features with data types pointing to entities in the core schema. If"privates", includes private fields (fields starting with_).features (
bool|list[str], default:False) – Configure the features to include. Can be a feature name or a list of such names. If"queryset", infers the features used within the current queryset. Only available forArtifact,Record, andRun.limit (
int, default:100) – Maximum number of rows to display. IfNone, includes all results.order_by – Field name to order the records by. Prefix with ‘-’ for descending order. Defaults to ‘-id’ to get the most recent records. This argument is ignored if the queryset is already ordered or if the specified field does not exist.
- Return type:
DataFrame
Examples
Include the name of the creator:
ln.Record.to_dataframe(include="created_by__name")
Include features:
ln.Artifact.to_dataframe(include="features")
Include selected features:
ln.Artifact.to_dataframe(features=["cell_type_by_expert", "cell_type_by_model"])
- classmethod search(string, *, field=None, limit=20, case_sensitive=False)¶
Search.
- Parameters:
string (
str) – The input string to match against the field ontology values.field (
str|DeferredAttribute|None, default:None) – The field or fields to search. Search all string fields by default.limit (
int|None, default:20) – Maximum amount of top results to return.case_sensitive (
bool, default:False) – Whether the match is case sensitive.
- Return type:
- Returns:
A sorted
DataFrame of search results with a score in column score, or a QuerySet if return_queryset is True.
Examples
records = ln.Record.from_values(["Label1", "Label2", "Label3"], field="name").save()
ln.Record.search("Label2")
- classmethod lookup(field=None, return_field=None)¶
Return an auto-complete object for a field.
- Parameters:
field (
str|DeferredAttribute|None, default:None) – The field to look up the values for. Defaults to first string field.return_field (
str|DeferredAttribute|None, default:None) – The field to return. IfNone, returns the whole record.keep – When multiple records are found for a lookup, how to return the records. -
"first": return the first record. -"last": return the last record. -False: return all records.
- Return type:
NamedTuple- Returns:
A
NamedTupleof lookup information of the field values with a dictionary converter.
See also
Examples
Look up via auto-complete:
import bionty as bt
bt.Gene.from_source(symbol="ADGB-DT").save()
lookup = bt.Gene.lookup()
lookup.adgb_dt
Look up via auto-complete in dictionary:
lookup_dict = lookup.dict()
lookup_dict['ADGB-DT']
Look up via a specific field:
lookup_by_ensembl_id = bt.Gene.lookup(field="ensembl_gene_id")
lookup_by_ensembl_id.ensg00000002745
Return a specific field value instead of the full record:
lookup_return_symbols = bt.Gene.lookup(field="ensembl_gene_id", return_field="symbol")
Methods¶
- restore()¶
Restore from trash onto the main branch.
Does not restore descendant records if the record is
HasTypewithis_type = True.- Return type:
None
- delete(permanent=None, **kwargs)¶
Delete record.
If record is
HasTypewithis_type = True, deletes all descendant records, too.- Parameters:
permanent (
bool|None, default:None) – Whether to permanently delete the record (skips trash). IfNone, performs soft delete if the record is not already in the trash.- Return type:
None
Examples
For any
SQLRecord object record, call:
>>> record.delete()
- save(*args, **kwargs)¶
Save.
Always saves to the default database.
- Return type:
TypeVar(T, bound= SQLRecord)
- refresh_from_db(using=None, fields=None, from_queryset=None)¶
Reload field values from the database.
By default, the reloading happens from the database this instance was loaded from, or by the read router if this instance wasn’t loaded from any database. The using parameter will override the default.
Fields can be used to specify which fields to reload. The fields should be an iterable of field attnames. If fields is None, then all non-deferred fields are reloaded.
When accessing deferred fields of an instance, the deferred loading of the field will call this method.
- async arefresh_from_db(using=None, fields=None, from_queryset=None)¶
Utility classes¶
- class lamindb.models.LazyArtifact(suffix, overwrite_versions, **kwargs)¶
Lazy artifact for streaming to auto-generated internal paths.
This is needed when it is desirable to stream to a
lamindb auto-generated internal path and register the path as an artifact (see Artifact). This object creates a real artifact on
.save()with the provided arguments.- Parameters:
suffix (
str) – The suffix for the auto-generated internal pathoverwrite_versions (
bool) – Whether to overwrite versions.**kwargs – Keyword arguments for the artifact to be created.
Examples
Create a lazy artifact, write to the path and save to get a real artifact:
lazy = ln.Artifact.from_lazy(suffix=".zarr", overwrite_versions=True, key="mydata.zarr")
zarr.open(lazy.path, mode="w")["test"] = np.array(["test"])  # stream to the path
artifact = lazy.save()
- class lamindb.models.SQLRecordList(records)¶
Is ordered, can’t be queried, but has
.to_dataframe().- to_dataframe()¶
- Return type:
DataFrame
- to_list(field)¶
- Return type:
list[str]
- one()¶
Exactly one result. Raises an error if there are zero or multiple results.
- Return type:
TypeVar(T)
- save()¶
Save all records to the database.
- Return type:
SQLRecordList[TypeVar(T)]
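Constructors such as from_values typically return such a list before anything is saved; a minimal sketch mirroring the from_values examples above (assuming from_values yields a SQLRecordList here):
records = ln.Record.from_values(["Label1", "Label2"], field="name")  # not yet saved
records.to_dataframe()  # preview as a DataFrame
records.save()  # persists all records and returns the SQLRecordList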
- append(item)¶
- insert(i, item)¶
- pop(i=-1)¶
- remove(item)¶
- clear()¶
- copy()¶
- count(item)¶
- index(item, *args)¶
- reverse()¶
- sort(*args, **kwds)¶
- extend(other)¶
- class lamindb.models.InspectResult(validated_df, validated, nonvalidated, frac_validated, n_empty, n_unique)¶
Result of inspect.
An InspectResult object of calls such as
inspect().- property df: DataFrame¶
A DataFrame indexed by values with a boolean
__validated__column.
- property frac_validated: float¶
Fraction of items that were validated.
- property n_empty: int¶
Number of empty items.
- property n_unique: int¶
Number of unique items.
- property non_validated: list[str]¶
List of items that failed
validate(). This list can be used to remove any non-validated values, such as genes that do not map against the specified source.
- property synonyms_mapper: dict¶
Synonyms mapper dictionary.
Such a dictionary maps the actual values to their synonyms which can be used to rename values accordingly.
Examples
>>> markers = pd.DataFrame(index=["KI67", "CCR7"])
>>> synonyms_mapper = bt.CellMarker.standardize(markers.index, return_mapper=True)
{'KI67': 'Ki67', 'CCR7': 'Ccr7'}
- property validated: list[str]¶
List of items successfully validated by
validate().
- class lamindb.models.ValidateFields¶
- class lamindb.models.SchemaOptionals(schema)¶
Manage and access optional features in a schema.
- get_uids()¶
Get the uids of the optional features.
Does not need an additional query to the database, while
get()does.- Return type:
list[str]
- set(features)¶
Set the optional features (overwrites whichever features are currently optional).
- Return type:
None
- remove(features)¶
Make one or multiple features required by removing them from the set of optional features.
- Return type:
None
- add(features)¶
Make one or multiple features optional by adding them to the set of optional features.
- Return type:
None
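A hedged sketch of toggling optional features on a schema, assuming the manager is exposed on the schema as .optionals and that the schema and feature below already exist:
schema = ln.Schema.get(name="my-schema")
cell_type = ln.Feature.get(name="cell_type")
schema.optionals.add(cell_type)  # mark the feature as optional
schema.optionals.get_uids()  # uids of all optional features, without an extra query
schema.optionals.remove(cell_type)  # make it required again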