insert_all or activerecord-import integration #1788

@Amnesthesia

Problem this feature will solve

Inserting large amounts of data, or running a large test suite that creates many lists, is slow because each record is created individually and its callbacks are fired one record at a time, which doesn't scale well.

In other parts of our app, we handle bulk inserts with ActiveRecord::Import, which lets us do recursive inserts: it builds out the hierarchy first and then optimizes the insert statements. We occasionally also use the now-standard insert_all.

Desired solution

I would love some way of integrating activerecord-import or ActiveRecord's insert_all with FactoryBot, so that create_list would:

  1. Build out required associations, then insert those in bulk, and fire their callbacks
  2. Bulk insert the list of records and fire callbacks

Basically, a seamless, plug-and-play solution that optimizes inserts for lists. Providing something like to_create { ... }, but as to_create_list { ... }, would work too. I think the issue is that the way the list strategies are generated doesn't really allow for much customization.
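
To illustrate the shape of the to_create_list { ... } idea, here is a minimal plain-Ruby sketch. It is not FactoryBot code: ToyFactory and everything on it are hypothetical names made up for this example, and "persisting" just flips a flag on a hash. The point is only that a list-level hook receives the whole batch once, while after-create callbacks still run per record.

```ruby
# Hypothetical mini-factory, NOT FactoryBot: shows how a list-level
# persistence hook could sit next to the per-record one.
class ToyFactory
  def initialize(&builder)
    @builder = builder
    # Default per-record persistence: mark the record as persisted.
    @to_create = ->(record) { record[:persisted] = true }
    @to_create_list = nil
    @after_create = []
  end

  # Per-record persistence hook (the existing idea).
  def to_create(&block)
    @to_create = block
  end

  # List-level persistence hook (the proposed idea, assumed API).
  def to_create_list(&block)
    @to_create_list = block
  end

  def after_create(&block)
    @after_create << block
  end

  def create_list(amount)
    # Build every record first so the batch is known up front.
    records = Array.new(amount) { |i| @builder.call(i) }

    if @to_create_list
      @to_create_list.call(records) # one call for the whole batch
    else
      records.each { |record| @to_create.call(record) }
    end

    # Callbacks still fire once per record, after persistence.
    records.each { |record| @after_create.each { |cb| cb.call(record) } }
    records
  end
end

bulk_calls = 0
callback_count = 0

factory = ToyFactory.new { |i| { name: "user-#{i}", persisted: false } }
factory.to_create_list do |records|
  bulk_calls += 1 # stands in for a single bulk INSERT
  records.each { |record| record[:persisted] = true }
end
factory.after_create { |_record| callback_count += 1 }

users = factory.create_list(3)
puts "bulk calls: #{bulk_calls}, callbacks fired: #{callback_count}"
```

In a real integration the block passed to to_create_list would presumably call something like import! or insert_all on the model class; that part is left out here because it depends on how FactoryBot would expose the hook.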

Alternatives considered

I first attempted to write my own strategy for doing this, but it turns out that all the _list strategies are really just i.times { send(strategy, ...) }, so with the current implementation I couldn't see a good way of doing it.
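
To make that limitation concrete, here is a small plain-Ruby model (not FactoryBot's actual source). FakeTable, create, create_list and bulk_create_list are hypothetical names for illustration, and the statement counter stands in for round-trips to the database. A list strategy that only repeats the single-record strategy never sees the whole batch, so it cannot combine the inserts.

```ruby
# Toy stand-in for a database table; counts how many INSERT statements it receives.
class FakeTable
  attr_reader :rows, :statements

  def initialize
    @rows = []
    @statements = 0
  end

  # One statement per call, regardless of how many rows it carries.
  def insert(rows)
    @statements += 1
    @rows.concat(rows)
  end
end

# Single-record strategy: build the attributes and insert immediately.
def create(table, index)
  row = { name: "record-#{index}" }
  table.insert([row])
  row
end

# List strategy modeled on the `i.times { send(strategy, ...) }` shape:
# it can only repeat the single-record strategy.
def create_list(table, amount)
  Array.new(amount) { |i| create(table, i) }
end

# What a batch-aware list strategy would need: build everything, then insert once.
def bulk_create_list(table, amount)
  rows = Array.new(amount) { |i| { name: "record-#{i}" } }
  table.insert(rows)
  rows
end

per_record = FakeTable.new
create_list(per_record, 100)

batched = FakeTable.new
bulk_create_list(batched, 100)

puts "per-record statements: #{per_record.statements}"
puts "batched statements:    #{batched.statements}"
```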

I ended up writing a pretty hacky way to do something similar, but I feel like it hooks into too many internals, and it doesn't actually fire the association factories' callbacks (it just inserts the records):

FactoryBot.define_singleton_method(:import_list) do |name, amount, *traits_and_overrides, &block|
  unless amount.respond_to?(:times)
    raise ArgumentError, "count missing for import_list"
  end

  records = Array.new(amount) do |i|
    block_with_index = FactoryBot::StrategySyntaxMethodRegistrar.with_index(block, i)
    send(:build, name, *traits_and_overrides, &block_with_index)
  end

  # Collect belongs_to associations that haven't been persisted and need to be persisted
  # before we insert the list
  associated_records = {}

  # Proc to find unsaved belongs_to associations on a record and add to cache
  find_belongs_to_associations = lambda do |record|
    record.class.reflect_on_all_associations(:belongs_to).each do |association|
      associated_record = record.send(association.name)
      next unless associated_record.present?
      next if associated_record.persisted?
      associated_records[associated_record.class] ||= []
      next if associated_records[associated_record.class].include?(associated_record)
      associated_records[associated_record.class] << associated_record
      find_belongs_to_associations.call(associated_record)
    end
  end
 
  records.each do |record|
    find_belongs_to_associations.call(record)
  end

  # For each model, import the records in bulk
  associated_records.each_value do |assoc_records|
    next if assoc_records.empty?
    assoc_records.first.class.import!(assoc_records, recursive: true, ignore: true)
  end

  # Now import the main list of records (skipped when the list is empty,
  # since records.first would be nil)
  records.first.class.import!(records.reject(&:id), recursive: true, ignore: true) if records.any?

  overrides = traits_and_overrides.select { |t| t.is_a?(Hash) }.reduce({}, :merge)
  traits = traits_and_overrides.select { |t| t.is_a?(Symbol) }

  # Apply callbacks (here comes the hackiest bit .... This is all hooking into internals)
  factory = FactoryBot.factories[name]
  factory = factory.with_traits(traits) if traits.any?

  # We want to fire after(:create) callbacks after importing records
  strategy = FactoryBot.strategy_by_name(:create).new
  
  evaluator = factory.send(:evaluator_class).new(
    strategy,
    overrides.symbolize_keys
  )
  observer = FactoryBot::CallbacksObserver.new(factory.send(:callbacks), evaluator)
  attribute_assigner = FactoryBot::AttributeAssigner.new(
    evaluator,
    factory.send(:build_class)
  )
  evaluation = FactoryBot::Evaluation.new(
    evaluator,
    attribute_assigner,
    factory.send(:compiled_to_create),
    observer
  )

  # For each imported record, fire after_create. Unfortunately we don't have the factories
  # for the belongs_to records we imported, and can't fire their callbacks
  records.each do |record|
    evaluation.notify(:after_create, record)
  end

  records
end
