I’m working on a project that uses Ray’s Data library to read JSONL files from one of my directories. The issue is that many of my JSONL files contain UTF-8 encoded non-ASCII characters (some of them French), and Ray seems to read and write these files with a different encoding.
I’m wondering if there is a way to configure read_json() and write_json() so that these Unicode characters are preserved and written back out in character form.
I did not see any explicit way to do this in the documentation, but I’m not sure whether some kwargs might help or whether another solution exists.
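To illustrate what I mean by "character form", here is a minimal sketch using only Python's standard json module, not Ray. The record and variable names are made up for illustration. I am assuming the output I get from Ray resembles the escaped form below, but I have not confirmed what Ray does internally.

```python
import json

# Hypothetical record with French text, similar to what my JSONL files contain.
record = {"ville": "Montréal", "note": "café crème"}

# Default behaviour: non-ASCII characters are written as \uXXXX escapes.
escaped = json.dumps(record)

# With ensure_ascii=False the characters are kept as-is in the output string.
preserved = json.dumps(record, ensure_ascii=False)

print(escaped)
print(preserved)

# Both strings decode back to the same data; only the on-disk form differs.
assert json.loads(escaped) == json.loads(preserved) == record

# When writing the preserved form to disk, the file encoding must be set explicitly:
# with open("out.jsonl", "w", encoding="utf-8") as f:
#     f.write(preserved + "\n")
```

If write_json() can forward arguments to the underlying JSON writer, something equivalent to ensure_ascii=False is what I'm after.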
Thanks.