pip3 install elbo --upgrade
pip3 install virtualenv virtualenvwrapper
virtualenv -p python3 .venv
~/Library/Python/3.9/bin/virtualenv -p python3 .venv
. .venv/bin/activate
. .venv/bin/activate.fish
elbo login
git clone https://github.com/elbo-ai/elbo-examples.git
cd elbo-examples/pytorch/mnist_classifier/
elbo run --config elbo.yaml
elbo.client is starting 'Train MNIST Classifier' submission ...
elbo.client Hey Anu 👋, welcome!
elbo.client is uploading sources from ....
elbo.client upload successful.
elbo.client number of compute choices - 28
? Please choose: (Use arrow keys)
» $ 0.0028/hour Micro (for testing) 2 cpu 1Gb mem 0Gb gpu-mem AWS (spot)
$ 0.0150/hour Standard (for testing) 1 cpu 2Gb mem 0Gb gpu-mem Linode (~ 9 mins to provision)
$ 0.0770/hour Micro (for testing) 2 cpu 1Gb mem 0Gb gpu-mem AWS
$ 0.2700/hour Nvidia Tesla K80 4 cpu 61Gb mem 12Gb gpu-mem AWS (spot)
$ 0.6100/hour Nvidia Quadro 4000 16 cpu 32Gb mem 8Gb gpu-mem TensorDock
$ 0.9000/hour Nvidia Tesla K80 4 cpu 61Gb mem 12Gb gpu-mem AWS
$ 0.9180/hour Nvidia V100 8 cpu 61Gb mem 16Gb gpu-mem AWS (spot)
$ 0.9200/hour Nvidia Quadro 5000 2 cpu 4Gb mem 16Gb gpu-mem FluidStack
$ 0.9600/hour Nvidia A5000 2 cpu 16Gb mem 24Gb gpu-mem TensorDock
$ 1.4900/hour Nvidia A4000 12 cpu 64Gb mem 16Gb gpu-mem FluidStack
$ 1.4940/hour Nvidia A40 2 cpu 12Gb mem 48Gb gpu-mem TensorDock
$ 1.5000/hour Nvidia Quadro 6000 8 cpu 32Gb mem 0Gb gpu-mem Linode (~ 9 mins to provision)
$ 1.5140/hour Nvidia A6000 2 cpu 16Gb mem 48Gb gpu-mem TensorDock
$ 2.1600/hour 8x Nvidia Tesla K80 32 cpu 488Gb mem 12Gb gpu-mem AWS (spot)
$ 3.0000/hour 2x Nvidia Quadro 6000 16 cpu 64Gb mem 0Gb gpu-mem Linode (~ 9 mins to provision)
$ 3.0600/hour Nvidia V100 8 cpu 61Gb mem 16Gb gpu-mem AWS
$ 3.6720/hour 4x Nvidia V100 32 cpu 244Gb mem 16Gb gpu-mem AWS (spot)
$ 3.7460/hour 7x Nvidia V100 6 cpu 8Gb mem 16Gb gpu-mem TensorDock
$ 4.3200/hour 16x Nvidia Tesla K80 64 cpu 732Gb mem 12Gb gpu-mem AWS (spot)
$ 4.5000/hour 3x Nvidia Quadro 6000 20 cpu 96Gb mem 0Gb gpu-mem Linode (~ 9 mins to provision)
$ 6.0000/hour 4x Nvidia Quadro 6000 24 cpu 128Gb mem 0Gb gpu-mem Linode (~ 9 mins to provision)
$ 7.3440/hour 8x Nvidia V100 64 cpu 488Gb mem 16Gb gpu-mem AWS (spot)
$ 7.9200/hour 8x Nvidia Tesla K80 32 cpu 488Gb mem 12Gb gpu-mem AWS
$ 9.8318/hour 8x Nvidia A100 96 cpu 1152Gb mem 80Gb gpu-mem AWS (spot)
$13.0360/hour 4x Nvidia V100 32 cpu 244Gb mem 16Gb gpu-mem AWS
$14.4000/hour 16x Nvidia Tesla K80 64 cpu 732Gb mem 12Gb gpu-mem AWS
$24.4800/hour 8x Nvidia V100 64 cpu 488Gb mem 16Gb gpu-mem AWS
$32.7726/hour 8x Nvidia A100 96 cpu 1152Gb mem 80Gb gpu-mem AWS
if __name__ == '__main__':
    print("Training MNIST classifier")
    # Download the MNIST train and test splits into ./data as tensors.
    train_data = datasets.MNIST("data", train=True, transform=transforms.ToTensor(), download=True)
    test_data = datasets.MNIST("data", train=False, transform=transforms.ToTensor(), download=True)
    model = MNISTClassifier()
    num_epochs = 10
    # Iterate over epochs through ELBO so the model state is saved every epoch.
    for epoch in elbo.elbo.ElboEpochIterator(range(0, num_epochs), model, save_state_interval=1):
        loss = train(model, train_data)
        print(f"Epoch = {epoch} Loss = {loss}")
        test(model, test_data)
The ELBO tracker lets you monitor your ML tasks on your phone using the ELBO Tracker app.
from elbo.tracker.tracker import TaskTracker
tracker = TaskTracker("Hello World")
tracker.log_message("Hi there! 👋")
tracker.log_key_metric("Accuracy", 100.0)
tracker.log_image("An AI generated image of a Cat 🐱", "images/aicat.png")
tracker.upload_logs()

(.venv) joy@elbo ~> elbo
Usage: elbo [OPTIONS] COMMAND [ARGS]...

  elbo.ai - Train more, pay less

Options:
  --help  Show this message and exit.

Commands:
  balance   Show the users balance
  create    Create an instance and get SSH access to it.
  download  Download the artifacts for the task.
  kill      Stop the task.
  login     Login to the ELBO service.
  notebook  Start a Jupyter Lab session.
  ps        Show list of all tasks.
  run       Submit a task specified by the config file.
  show      Show the task.
  ssh       SSH into the machine running the task.
  status    Get ELBO server status.
(.venv) joy@elbo ~/p/elbo-examples (main)> elbo notebook
elbo.client creating notebook using config at project git@github.com:elbo-ai/elbo-examples.git ...
elbo.client cloning git@github.com:elbo-ai/elbo-examples.git to /var/folders/8f/vcfd13292kl6p93zxf1yypl40000gn/T/tmpfl7mum90 ...
elbo.client Submitting notebook run config : /var/folders/8f/vcfd13292kl6p93zxf1yypl40000gn/T/tmpfl7mum90/notebook/elbo.yaml
elbo.client is starting 'Start a jupyter notebook' submission ...
elbo.client Hey Anu 👋, welcome!
elbo.client is uploading sources from /var/folders/8f/vcfd13292kl6p93zxf1yypl40000gn/T/tmpfl7mum90/notebook/....
elbo.client upload successful.
elbo.client number of compute choices - 28
? Please choose: $ 0.4200/hour Nvidia Quadro 4000 2 cpu 4Gb mem 8Gb gpu-mem FluidStack
elbo.client compute node ip 216.153.51.67
elbo.client task with ID 125 is submitted successfully.
elbo.client ----------------------------------------------
elbo.client ssh using - ssh [email protected] -p 2222
elbo.client scp using - scp [email protected] -p 2222
elbo.client password: BZ7qNxpVJAsAXEequQ
elbo.client ----------------------------------------------
elbo.client here are URLS for task logs ...
elbo.client setup logs - http://216.153.51.67/setup
elbo.client requirements logs - http://216.153.51.67/requirements
elbo.client task logs - http://216.153.51.67/task
elbo.client TIP: 💡 see task details with command: `elbo show 125`
elbo.client ⏳ It may take a minute or two for the node to be reachable.
elbo.client node started ..
elbo.client Notebook URL = http://216.153.51.67:8080/?token=5824d0cfbbc3ed1710969d4cfe8404c6dfdcc37e206d931d
(.venv) joy@elbo ~/p/elbo-examples (main)> elbo run --config pytorch/mnist_classifier/elbo.yaml
elbo.client is starting 'Train MNIST Classifier' submission ...
elbo.client Hey Anu 👋, welcome!
elbo.client is uploading sources from pytorch/mnist_classifier/....
elbo.client upload successful.
elbo.client number of compute choices - 27
? Please choose: (Use arrow keys)
» $ 0.0028/hour Micro (for testing) 2 cpu 1Gb mem 0Gb gpu-mem AWS (spot)
$ 0.0150/hour Standard (for testing) 1 cpu 2Gb mem 0Gb gpu-mem Linode (~ 9 mins to provision)
$ 0.0770/hour Micro (for testing) 2 cpu 1Gb mem 0Gb gpu-mem AWS
$ 0.2700/hour Nvidia Tesla K80 4 cpu 61Gb mem 12Gb gpu-mem AWS (spot)
$ 0.7220/hour Nvidia A4000 2 cpu 4Gb mem 16Gb gpu-mem TensorDock
$ 0.9000/hour Nvidia Tesla K80 4 cpu 61Gb mem 12Gb gpu-mem AWS
$ 0.9180/hour Nvidia V100 8 cpu 61Gb mem 16Gb gpu-mem AWS (spot)
$ 0.9200/hour Nvidia Quadro 5000 2 cpu 4Gb mem 16Gb gpu-mem FluidStack
$ 0.9600/hour Nvidia A5000 2 cpu 16Gb mem 24Gb gpu-mem TensorDock
$ 1.4940/hour Nvidia A40 2 cpu 12Gb mem 48Gb gpu-mem TensorDock
$ 1.5000/hour Nvidia Quadro 6000 8 cpu 32Gb mem 0Gb gpu-mem Linode (~ 9 mins to provision)
$ 1.5140/hour Nvidia A6000 2 cpu 16Gb mem 48Gb gpu-mem TensorDock
$ 2.1600/hour 8x Nvidia Tesla K80 32 cpu 488Gb mem 12Gb gpu-mem AWS (spot)
$ 3.0000/hour 2x Nvidia Quadro 6000 16 cpu 64Gb mem 0Gb gpu-mem Linode (~ 9 mins to provision)
$ 3.0600/hour Nvidia V100 8 cpu 61Gb mem 16Gb gpu-mem AWS
$ 3.6720/hour 4x Nvidia V100 32 cpu 244Gb mem 16Gb gpu-mem AWS (spot)
$ 3.7460/hour 7x Nvidia V100 6 cpu 8Gb mem 16Gb gpu-mem TensorDock
$ 4.3200/hour 16x Nvidia Tesla K80 64 cpu 732Gb mem 12Gb gpu-mem AWS (spot)
$ 4.5000/hour 3x Nvidia Quadro 6000 20 cpu 96Gb mem 0Gb gpu-mem Linode (~ 9 mins to provision)
$ 6.0000/hour 4x Nvidia Quadro 6000 24 cpu 128Gb mem 0Gb gpu-mem Linode (~ 9 mins to provision)
$ 7.3440/hour 8x Nvidia V100 64 cpu 488Gb mem 16Gb gpu-mem AWS (spot)
$ 7.9200/hour 8x Nvidia Tesla K80 32 cpu 488Gb mem 12Gb gpu-mem AWS
$ 9.8318/hour 8x Nvidia A100 96 cpu 1152Gb mem 80Gb gpu-mem AWS (spot)
$13.0360/hour 4x Nvidia V100 32 cpu 244Gb mem 16Gb gpu-mem AWS.
(.venv) joy@elbo ~/p/elbo-examples (main)> elbo kill 153
elbo.client Stopping task - 153
elbo.client Task with id=153 is marked for cancellation.
(.venv) joy@elbo ~/p/elbo-examples (main)> elbo show 123
elbo.client Fetching task - 123
elbo.client Task with id = 123:
Billed Cost : 0.2100000
Billed Upto Time : 03/07/22 12:58
Bucket Key : [email protected]/elbo-archive-26b7975b.tgz
Completion Time : 03/07/22 12:58
Compute Type : FluidStack None Nvidia Quadro 4000x1(8Gb) CPU=2(4Gb) Cost=0.42 Cost/Transistor=0.028767123287671233 CUDA Cores=2304
Config File Path : None
Cost Per Hour : 0.4200000
Created Time : 03/07/22 12:27
Customer Billed : True
Instance ID : recbPXDkeuR3SBTV7
Instance Type : Dedicated
Keep Alive : True
Last Modified Time : 03/07/22 12:26
Name : Start a jupyter notebook
Password : ou4zebZ2XCoaMhdDrQ
Previous Task ID : None
Provider : FluidStack
Record ID : 185
Requirements Log Path : http://216.153.51.67/requirements
Run Time : 00h:31m:28s
SSH Only : False
Session ID : 6381c2c835e340f6957542720dee8d13
Setup Log Path : http://216.153.51.67/setup
Status : Archived
Submission Time : 03/07/22 12:26
Target File Path : [email protected]/elbo-6381c2c835e340f6957542720dee8d13-artifacts.tgz
Task ID : 123
Task Log Path : http://216.153.51.67/task
Total Cost : 0.2170000
User ID : [email protected]
ip : 216.153.51.67
(.venv) joy@elbo ~/p/elbo-examples (main)> elbo download 159
elbo.client Downloading Artifacts for - 159
elbo.client Artifacts for task id = 159 downloaded to /var/folders/8f/vcfd13292kl6p93zxf1yypl40000gn/T/tmp8uetxm69/elbo-3dc59be0e9b545378b6a679345175f1c-artifacts.tgz
(.venv) joy@elbo ~> elbo ps -r
elbo.client your running tasks:
+---------+--------------+---------------+----------+---------------------+---------+---------------------+------------------------+-----------------+----------+------------+
| Task ID | Compute Type | Cost Per Hour | Provider | Start Time | Status | Submission Time | Task Name | Completion Time | Run Time | Total Cost |
+---------+--------------+---------------+----------+---------------------+---------+---------------------+------------------------+-----------------+----------+------------+
| 153 | Economy GPU | 0.0028 | AWS | 02/04/2022 14:26 PM | Running | 02/04/2022 14:25 PM | SSH only session | | | |
| 155 | Economy GPU | 0.0028 | AWS | 02/04/2022 14:29 PM | Running | 02/04/2022 14:28 PM | SSH only session | | | |
| 156 | Economy GPU | 0.27 | AWS | 02/04/2022 14:31 PM | Running | 02/04/2022 14:29 PM | Train MNIST Classifier | | | |
| 158 | Economy GPU | 0.27 | AWS | 02/04/2022 15:05 PM | Running | 02/04/2022 15:04 PM | Train MNIST Classifier | | | |
| 159 | Economy GPU | 0.0028 | AWS | 02/04/2022 15:45 PM | Running | 02/04/2022 15:45 PM | Train MNIST Classifier | | | |
+---------+--------------+---------------+----------+---------------------+---------+---------------------+------------------------+-----------------+----------+------------+
(.venv) joy@elbo ~> elbo ssh 159
elbo.client Trying to SSH into task 159...
elbo.client SSH:
elbo.client Running Command : ssh [email protected] -p 2222
elbo.client Enter this password when prompted: elbo
Warning: Permanently added '[44.234.188.107]:2222' (ED25519) to the list of known hosts.
[email protected]'s password:
Welcome to Ubuntu 18.04.6 LTS (GNU/Linux 5.4.0-1061-aws x86_64)
* Documentation: https://help.ubuntu.com
* Management: https://landscape.canonical.com
* Support: https://ubuntu.com/advantage
This system has been minimized by removing packages and content that are
not required on a system that users do not log into.
To restore this content, you can run the 'unminimize' command.
The programs included with the Ubuntu system are free software;
the exact distribution terms for each program are described in the
individual files in /usr/share/doc/*/copyright.
Ubuntu comes with ABSOLUTELY NO WARRANTY, to the extent permitted by
applicable law.
root@6b63a53b9691:~#
(.venv) joy@elbo ~> elbo status
elbo.client Membership: ✅
elbo.client Database : ✅
elbo.client Server : ✅
class MNISTClassifier(ElboModel, nn.Module):
    def get_artifacts_directory(self):
        # Directory whose contents are copied back as task output.
        return 'artifacts'

    def save_state(self):
        # Write the model weights into the artifacts directory.
        model_path = os.path.join(self.get_artifacts_directory(), "mnist_model")
        torch.save(self.state_dict(), model_path)
        print(f"Saving model to {model_path}")

    def load_state(self, state_dir):
        # Note: state_dir is unused here; the weights are read from the artifacts directory.
        model_path = os.path.join(self.get_artifacts_directory(), "mnist_model")
        print(f"Loading model from {model_path}")
        self.load_state_dict(torch.load(model_path))
You can use the ELBO Python API to automatically save the training state.
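To make the save-every-N-epochs idea concrete, here is a minimal sketch that runs without ELBO or PyTorch. `CheckpointingEpochs` and `ToyModel` are hypothetical stand-ins written for this illustration; they are not ELBO's implementation and only mimic the `save_state` / `load_state` method names and the `save_state_interval` argument shown above.

```python
import json
import os
import tempfile


class CheckpointingEpochs:
    """Hypothetical stand-in (not ELBO code): yields epochs and calls
    model.save_state() after every `save_state_interval` completed epochs."""

    def __init__(self, epochs, model, save_state_interval=1):
        self.epochs = epochs
        self.model = model
        self.save_state_interval = save_state_interval

    def __iter__(self):
        for completed, epoch in enumerate(self.epochs, start=1):
            yield epoch  # the caller's loop body trains for one epoch here
            # Execution resumes here once the loop body has finished.
            if completed % self.save_state_interval == 0:
                self.model.save_state()


class ToyModel:
    """Hypothetical model that stores a single number instead of real weights."""

    def __init__(self, artifacts_dir):
        self.artifacts_dir = artifacts_dir
        self.weights = 0.0
        self.saves = 0

    def save_state(self):
        os.makedirs(self.artifacts_dir, exist_ok=True)
        path = os.path.join(self.artifacts_dir, "toy_model.json")
        with open(path, "w") as f:
            json.dump({"weights": self.weights}, f)
        self.saves += 1

    def load_state(self, state_dir):
        path = os.path.join(state_dir, "toy_model.json")
        with open(path) as f:
            self.weights = json.load(f)["weights"]


artifacts_dir = tempfile.mkdtemp()
model = ToyModel(artifacts_dir)
for epoch in CheckpointingEpochs(range(0, 4), model, save_state_interval=2):
    model.weights += 1.0  # stands in for one epoch of training

# A fresh model can pick up where the last checkpoint left off.
restored = ToyModel(artifacts_dir)
restored.load_state(artifacts_dir)
```

Because the save happens after the loop body returns control to the iterator, breaking out of the loop mid-epoch in this sketch skips the save for that epoch.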
The ELBO configuration is specified in a YAML configuration file. Let's look at its contents.
pandas
numpy
torch
pytorch_lightning
tqdm
torchvision
wandb
#
# ELBO Sample Config File for MNIST Classifier Task
#
# All paths are relative to where the `elbo.yaml` file is placed
name: "Train MNIST Classifier"
# The GPU class to use - Economy, MidRange, HighEnd, All
gpu_class: Economy
# The script to run for setting up the environment. For example - installing packages
# on Ubuntu
setup: setup.sh
# The PIP requirements file. ELBO will install the requirements specified in this
# file before launching the task.
requirements: requirements.txt
# The main entry point in the task. Once the script exits or terminates, the task
# is considered complete.
run: main.py
# The task directory, relative to this file. This directory will be tar-balled and sent to ELBO task executor for
# execution
task_dir: .
# Artifacts directory. This is the directory that will be copied over as output. All model related files -
# checkpoints, generated samples, evaluation results etc. should be placed in this directory.
artifacts: ~/artifacts
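
Before submitting a task, it can help to sanity-check the configuration locally. The sketch below works on a plain dictionary that mirrors the sample file above. `check_config` is a hypothetical helper, not part of the ELBO client, and treating every key in the sample as required is an assumption made for illustration; only the four `gpu_class` values come from the comment in the sample config.

```python
# Keys that appear in the sample elbo.yaml (assumed required for this sketch).
REQUIRED_KEYS = ("name", "gpu_class", "setup", "requirements", "run", "task_dir", "artifacts")
# GPU classes listed in the sample config's comment.
GPU_CLASSES = {"Economy", "MidRange", "HighEnd", "All"}


def check_config(config):
    """Return a list of problems found in the config dict; empty means none found."""
    problems = [f"missing key: {key}" for key in REQUIRED_KEYS if key not in config]
    gpu_class = config.get("gpu_class")
    if gpu_class is not None and gpu_class not in GPU_CLASSES:
        problems.append(f"unknown gpu_class: {gpu_class}")
    return problems


sample = {
    "name": "Train MNIST Classifier",
    "gpu_class": "Economy",
    "setup": "setup.sh",
    "requirements": "requirements.txt",
    "run": "main.py",
    "task_dir": ".",
    "artifacts": "~/artifacts",
}

# Same config with the entry point removed and a misspelled GPU class.
broken = {key: value for key, value in sample.items() if key != "run"}
broken["gpu_class"] = "Cheap"

print(check_config(sample))
print(check_config(broken))
```

In practice the dictionary would come from parsing `elbo.yaml` with a YAML library; the parsing step is left out here to keep the sketch dependency-free.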