Error message when running the Dask example

When I follow the instructions here as written and run the example in the ‘scheduler’ section, I get the following error message:

Traceback (most recent call last):
  File "/home/ray/", line 24, in <module>
    df = dd.from_pandas(pd.DataFrame(np.random.randint(0, 100, size=(1024, 2)), columns=["age", "grade"]))
  File "/home/ray/anaconda3/lib/python3.7/site-packages/dask/dataframe/io/", line 208, in from_pandas
    raise ValueError("Exactly one of npartitions and chunksize must be specified.")
ValueError: Exactly one of npartitions and chunksize must be specified.
command terminated with exit code 1

cc @Stephanie_Wang @Clark_Zinzow

I have a PR, open for a few weeks now, that fixes the examples in the docs. We got hung up on getting the examples to run in the CI via Sphinx, which turned out to be less trivial than when I’ve done it in the past. @sangcho since I don’t have the bandwidth to get into the Sphinx stuff right now, can we merge that PR so the examples are fixed in the docs, and create a separate task for running these examples (and others) in the CI?

The PR is merged. Please add a task to the backlog so that you won’t forget about it! Also @mbehrendt please check the PR to see what code needs to be modified :)! Sorry for the hassle!

@sangcho can do! Also @mbehrendt the master docs have been updated and the Dask-on-Ray examples should now run correctly: Dask on Ray — Ray v2.0.0.dev0

@sangcho I added an issue for doing this for all of the data processing example code, in one sweep. [Core - docs] Run data processing examples in CI · Issue #14769 · ray-project/ray · GitHub

Great, thank you! I validated it and it worked.