If you need to return a small sample of random documents from a collection, here are three approaches you can try using the aggregation pipeline.
The $sample Stage
The $sample aggregation pipeline stage is designed specifically for randomly selecting a specified number of documents.
When you use $sample, you specify the number of documents you want returned in the size field.
Suppose we have the following collection called pets:
{ "_id" : 1, "name" : "Wag", "type" : "Dog", "weight" : 20 }
{ "_id" : 2, "name" : "Bark", "type" : "Dog", "weight" : 10 }
{ "_id" : 3, "name" : "Meow", "type" : "Cat", "weight" : 7 }
{ "_id" : 4, "name" : "Scratch", "type" : "Cat", "weight" : 8 }
{ "_id" : 5, "name" : "Bruce", "type" : "Bat", "weight" : 3 }
{ "_id" : 6, "name" : "Hop", "type" : "Kangaroo", "weight" : 130 }
{ "_id" : 7, "name" : "Punch", "type" : "Gorilla", "weight" : 300 }
{ "_id" : 8, "name" : "Snap", "type" : "Crocodile", "weight" : 400 }
{ "_id" : 9, "name" : "Flutter", "type" : "Hummingbird", "weight" : 1 }
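If you'd like to follow along, the sketch below holds the same nine documents as a plain JavaScript array. It assumes you would load them with db.pets.insertMany() in mongosh against a scratch database. That call is left as a comment so the snippet also runs under plain Node.js, and the variable name pets is just my choice.

```javascript
// The nine sample documents as a plain JavaScript array.
const pets = [
  { _id: 1, name: "Wag", type: "Dog", weight: 20 },
  { _id: 2, name: "Bark", type: "Dog", weight: 10 },
  { _id: 3, name: "Meow", type: "Cat", weight: 7 },
  { _id: 4, name: "Scratch", type: "Cat", weight: 8 },
  { _id: 5, name: "Bruce", type: "Bat", weight: 3 },
  { _id: 6, name: "Hop", type: "Kangaroo", weight: 130 },
  { _id: 7, name: "Punch", type: "Gorilla", weight: 300 },
  { _id: 8, name: "Snap", type: "Crocodile", weight: 400 },
  { _id: 9, name: "Flutter", type: "Hummingbird", weight: 1 },
];

// Assumption: inside mongosh, connected to a scratch database, you would run:
//   db.pets.insertMany(pets)
// Plain Node.js has no `db` object, so here we only confirm the array's size.
console.log(pets.length); // 9
```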
We can use $sample to take a random sample of those documents like this:
db.pets.aggregate( [ { $sample: { size: 3 } } ] )
Result:
{ "_id" : 1, "name" : "Wag", "type" : "Dog", "weight" : 20 }
{ "_id" : 5, "name" : "Bruce", "type" : "Bat", "weight" : 3 }
{ "_id" : 3, "name" : "Meow", "type" : "Cat", "weight" : 7 }
In this case I specified { size: 3 }, which returned three documents.
Here it is again using a different sample size:
db.pets.aggregate(
   [
      {
         $sample: { size: 5 }
      }
   ]
)
Result:
{ "_id" : 6, "name" : "Hop", "type" : "Kangaroo", "weight" : 130 }
{ "_id" : 5, "name" : "Bruce", "type" : "Bat", "weight" : 3 }
{ "_id" : 8, "name" : "Snap", "type" : "Crocodile", "weight" : 400 }
{ "_id" : 7, "name" : "Punch", "type" : "Gorilla", "weight" : 300 }
{ "_id" : 4, "name" : "Scratch", "type" : "Cat", "weight" : 8 }
The $sample stage works in one of two ways, depending on how many documents are in the collection, the sample size relative to the number of documents in the collection, and its position in the pipeline. See the MongoDB documentation for $sample for an explanation of how it works.
It's also possible for the $sample stage to return the same document more than once in the result set.
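To give a feel for what picking a fixed number of random documents involves, here is a plain JavaScript sketch of a random-sort sample: tag each document with a random key, sort by that key, and keep the first few. This is only a conceptual model under my own assumptions, not MongoDB's implementation. The names sampleByRandomSort and rng are hypothetical, and unlike the real stage this model never returns the same document twice.

```javascript
// Conceptual model only: pair each document with a random key,
// sort by that key, then keep the first `size` documents.
function sampleByRandomSort(docs, size, rng = Math.random) {
  return docs
    .map((doc) => ({ doc, key: rng() }))
    .sort((a, b) => a.key - b.key)
    .slice(0, size)
    .map((entry) => entry.doc);
}

// Stand-in documents with _id values 1 through 9.
const sampleDocs = Array.from({ length: 9 }, (_, i) => ({ _id: i + 1 }));

const picked = sampleByRandomSort(sampleDocs, 3);
console.log(picked.length); // 3
console.log(picked.map((doc) => doc._id)); // three _id values, different on each run
```

As with { size: 3 } in the example above, the size argument fixes how many documents come back, as long as the collection holds at least that many.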
The $rand Operator
The $rand operator was introduced in MongoDB 4.4.2, and its purpose is to return a random float between 0 and 1 each time it's called.
Therefore, we can use it in a $match stage in conjunction with other operators, such as $expr and $lt, to return a random sample of documents.
Example:
db.pets.aggregate( [ { $match: { $expr: { $lt: [ 0.5, { $rand: {} } ] } } } ] )
Result:
{ "_id" : 3, "name" : "Meow", "type" : "Cat", "weight" : 7 }
{ "_id" : 4, "name" : "Scratch", "type" : "Cat", "weight" : 8 }
{ "_id" : 6, "name" : "Hop", "type" : "Kangaroo", "weight" : 130 }
{ "_id" : 9, "name" : "Flutter", "type" : "Hummingbird", "weight" : 1 }
The result set from this approach differs from the $sample approach in that it doesn't return a fixed number of documents. The number of documents returned with this approach can vary.
For example, here's what happens when I run the same code several more times.
Result set 2:
{ "_id" : 1, "name" : "Wag", "type" : "Dog", "weight" : 20 }
{ "_id" : 7, "name" : "Punch", "type" : "Gorilla", "weight" : 300 }
{ "_id" : 8, "name" : "Snap", "type" : "Crocodile", "weight" : 400 }
Result set 3:
{ "_id" : 2, "name" : "Bark", "type" : "Dog", "weight" : 10 }
{ "_id" : 4, "name" : "Scratch", "type" : "Cat", "weight" : 8 }
{ "_id" : 9, "name" : "Flutter", "type" : "Hummingbird", "weight" : 1 }
Result set 4:
{ "_id" : 1, "name" : "Wag", "type" : "Dog", "weight" : 20 }
{ "_id" : 3, "name" : "Meow", "type" : "Cat", "weight" : 7 }
{ "_id" : 6, "name" : "Hop", "type" : "Kangaroo", "weight" : 130 }
{ "_id" : 8, "name" : "Snap", "type" : "Crocodile", "weight" : 400 }
Result set 5:
{ "_id" : 1, "name" : "Wag", "type" : "Dog", "weight" : 20 }
{ "_id" : 4, "name" : "Scratch", "type" : "Cat", "weight" : 8 }
{ "_id" : 7, "name" : "Punch", "type" : "Gorilla", "weight" : 300 }
{ "_id" : 8, "name" : "Snap", "type" : "Crocodile", "weight" : 400 }
{ "_id" : 9, "name" : "Flutter", "type" : "Hummingbird", "weight" : 1 }
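The varying counts make sense once you see that the $match expression is evaluated separately for every document. The plain JavaScript sketch below mimics that: it keeps a document whenever 0.5 is less than a fresh random number, the same comparison as { $lt: [ 0.5, { $rand: {} } ] }. It is a simulation, not MongoDB code. The names matchWithRand, threshold, and rng are hypothetical, and Math.random() stands in for $rand.

```javascript
// Keep each document independently when `threshold < rng()`,
// mirroring { $lt: [ 0.5, { $rand: {} } ] } in the pipeline above.
function matchWithRand(docs, threshold = 0.5, rng = Math.random) {
  return docs.filter(() => threshold < rng());
}

// Stand-in documents with _id values 1 through 9.
const randDocs = Array.from({ length: 9 }, (_, i) => ({ _id: i + 1 }));

// Each run draws new random numbers, so the count changes from run to run.
for (let run = 1; run <= 5; run++) {
  const result = matchWithRand(randDocs);
  console.log(`run ${run}: ${result.length} document(s)`);
}
```

With a threshold of 0.5, each document has roughly an even chance of being kept, so a nine-document collection tends to return about four or five documents, though any count from zero to nine is possible.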
The $sampleRate Operator
Introduced in MongoDB 4.4.2, the $sampleRate operator provides a more concise way of doing the same thing as the previous example.
When you use $sampleRate, you provide a sample rate as a floating point number between 0 and 1. The selection process uses a uniform random distribution, and the sample rate that you provide represents the probability that a given document will be selected as it passes through the pipeline.
Example:
db.pets.aggregate( [ { $match: { $sampleRate: 0.5 } } ] )
Result:
{ "_id" : 1, "name" : "Wag", "type" : "Dog", "weight" : 20 }
{ "_id" : 2, "name" : "Bark", "type" : "Dog", "weight" : 10 }
{ "_id" : 5, "name" : "Bruce", "type" : "Bat", "weight" : 3 }
{ "_id" : 6, "name" : "Hop", "type" : "Kangaroo", "weight" : 130 }
{ "_id" : 7, "name" : "Punch", "type" : "Gorilla", "weight" : 300 }
{ "_id" : 8, "name" : "Snap", "type" : "Crocodile", "weight" : 400 }
And run it again:
{ "_id" : 3, "name" : "Meow", "type" : "Cat", "weight" : 7 }
{ "_id" : 4, "name" : "Scratch", "type" : "Cat", "weight" : 8 }
{ "_id" : 7, "name" : "Punch", "type" : "Gorilla", "weight" : 300 }
{ "_id" : 8, "name" : "Snap", "type" : "Crocodile", "weight" : 400 }
{ "_id" : 9, "name" : "Flutter", "type" : "Hummingbird", "weight" : 1 }
And again:
{ "_id" : 1, "name" : "Wag", "type" : "Dog", "weight" : 20 }
{ "_id" : 2, "name" : "Bark", "type" : "Dog", "weight" : 10 }
{ "_id" : 3, "name" : "Meow", "type" : "Cat", "weight" : 7 }
{ "_id" : 8, "name" : "Snap", "type" : "Crocodile", "weight" : 400 }
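Since the sample rate is a per-document probability, you can estimate how many documents to expect. Assuming each document is selected independently (my assumption for this sketch), the count follows a binomial distribution with a mean of the collection size multiplied by the rate. The plain JavaScript below works that out for nine documents at a rate of 0.5. The helper names choose and probabilityOfCount are hypothetical.

```javascript
// Number of ways to choose k items out of n (binomial coefficient).
function choose(n, k) {
  let result = 1;
  for (let i = 1; i <= k; i++) {
    result = (result * (n - k + i)) / i;
  }
  return result;
}

// Probability that exactly k of n documents are selected when each
// one is selected independently with probability `rate`.
function probabilityOfCount(n, rate, k) {
  return choose(n, k) * Math.pow(rate, k) * Math.pow(1 - rate, n - k);
}

const collectionSize = 9;
const sampleRate = 0.5;

console.log(`expected count: ${collectionSize * sampleRate}`); // expected count: 4.5

// Chance of seeing each possible result-set size.
for (let k = 0; k <= collectionSize; k++) {
  const p = probabilityOfCount(collectionSize, sampleRate, k);
  console.log(`${k} document(s): ${(p * 100).toFixed(1)}%`);
}
```

Under that assumption, result sets of four, five, and six documents like the ones above are among the most likely outcomes for this collection.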