Q:

PySpark: take a random sample

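# withReplacement: True allows duplicate rows in the sample; False does not.
# fraction: 0.5 = expected fraction of rows to return (the result size is approximate, not exact).
# seed: 5 = random seed, so the same sample can be reproduced.
df.sample(True, 0.5, 5)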
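The fraction passed to `sample` is a probability, not a row count, which is why the number of rows returned varies around the target. As far as I know, sampling without replacement behaves like an independent coin flip per row. Below is a minimal plain-Python sketch of that idea. It is not Spark's implementation; `bernoulli_sample` and `rows` are hypothetical names used only for illustration.

```python
import random


def bernoulli_sample(rows, fraction, seed):
    """Keep each row independently with probability `fraction`.

    Illustrative stand-in for sampling without replacement;
    not Spark's actual code.
    """
    rng = random.Random(seed)  # fixed seed gives a reproducible sample
    return [row for row in rows if rng.random() < fraction]


rows = list(range(1000))

# Same seed, same input: the two samples are identical.
sample_a = bernoulli_sample(rows, 0.5, seed=5)
sample_b = bernoulli_sample(rows, 0.5, seed=5)

# The size is close to 0.5 * 1000 but is not guaranteed to be exactly 500.
print(len(sample_a))
```

If an exact number of rows is needed, one common approach in PySpark is to order by `pyspark.sql.functions.rand()` and then apply `limit(n)`, at the cost of a full sort.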
