Hi,
I have been using ray.util.sgd.data.Dataset with PyTorch. As I understand it, this API is meant for ingesting large datasets into Ray and training on them.
When I train on my data, the trainer prints the following stats:
{'num_samples': 351, 'epoch': 1.0, 'batch_count': 1.0, 'train_loss': 1.201025366783142, 'last_train_loss': 1.201025366783142}
{'num_samples': 0, 'epoch': 2.0, 'batch_count': 0.0}
{'num_samples': 0, 'epoch': 3.0, 'batch_count': 0.0}
All the samples are always consumed in the first epoch, and every later epoch is empty (num_samples is 0 and no loss is reported).
My code snippet:
import torch
from ray.util.sgd import TorchTrainer
from ray.util.sgd.torch import TrainingOperator

# model_creator, optimizer_creator and scheduler_creator are defined elsewhere.
MyTrainingOperator = TrainingOperator.from_creators(
    model_creator=model_creator,
    optimizer_creator=optimizer_creator,
    loss_creator=torch.nn.CrossEntropyLoss,
    scheduler_creator=scheduler_creator,
    data_creator=None)  # data is passed to train() as a Dataset instead

trainer = TorchTrainer(
    training_operator_cls=MyTrainingOperator,
    scheduler_step_freq="epoch",
    config={"batch_size": 64}
)

# Fetch values from the database as a ray.util.sgd.data.Dataset.
db_dataset = fetch_values_from_database()

for i in range(500):
    # Train for another epoch using the same dataset object
    stats = trainer.train(dataset=db_dataset, num_steps=200)
    print(stats)
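The stats above look similar to what happens when a one-shot iterator is reused across epochs. I do not know whether that is what Dataset does internally, so this is only a guess. Here is a minimal plain-Python sketch of that pattern. It is not Ray code, `make_batches` and `run_epoch` are hypothetical names, and it is not meant to reproduce my exact batch counts:

```python
def make_batches(num_samples, batch_size):
    """Hypothetical stand-in for a dataset: yields batch sizes once, then is exhausted."""
    for start in range(0, num_samples, batch_size):
        yield min(batch_size, num_samples - start)


def run_epoch(batches):
    """Consume whatever is left in `batches` and report simple counts."""
    num_samples = 0
    batch_count = 0
    for size in batches:
        num_samples += size
        batch_count += 1
    return {"num_samples": num_samples, "batch_count": batch_count}


# The generator is created once and reused, like db_dataset in my loop.
batches = make_batches(num_samples=351, batch_size=64)
stats_per_epoch = [run_epoch(batches) for _ in range(3)]

for epoch, stats in enumerate(stats_per_epoch, start=1):
    print(epoch, stats)
```

In this sketch only the first pass sees any samples, and later passes see none because the generator is already exhausted. If the Dataset behaves the same way, I would need to know how to reset or re-create it for each epoch.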
I can share more details on the dataset. Thanks in advance.