![]() We highly recommend that you create an isolated virtual environment locally first, so that the move to a distributed virtualenv will be more smooth. Sc.parallelize(range(1,10)).map(lambda x : np._version_).collect() Using virtualenv in the Local Environmentįirst we will create a virtual environment in the local environment. We save the code in a file named spark_virtualenv.py.įrom pyspark import SparkContext if _name_ = "_main_": This piece of code uses numpy in each map function. In this example we will use the following piece of code. Batch modeįor batch mode, I will follow the pattern of first developing the example in a local environment, and then moving it to a distributed environment, so that you can follow the same pattern for your development. In HDP 2.6 we support batch mode, but this post also includes a preview of interactive mode.
0 Comments
Leave a Reply. |
AuthorWrite something about yourself. No need to be fancy, just an overview. ArchivesCategories |