Space not getting reclaimed for MapR-DB table
When data expires based on TTL, it is removed from disk the next time the segment holding it is packed, and only then is the space reclaimed. Unlike HBase, MapR-DB does not routinely rewrite all data; this approach reduces write amplification. The side effect is that storage that should be released stays allocated if the segment is not being actively updated, which depends on the write pattern. To reclaim the space almost immediately, you can force MapR-DB to pack segments using the "maprcli table region pack -path <Table Name> -fid all" command.
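If the pack has to be triggered from a script rather than by hand, the command can be assembled in Python. This is only a minimal sketch: the function names build_pack_command and pack_table are hypothetical, it assumes maprcli is on the PATH of a node with cluster access, and it uses only the -path and -fid all flags of the pack command described here.

```python
import subprocess


def build_pack_command(table_path: str) -> list:
    """Return the argv that packs every region of the given table."""
    return ["maprcli", "table", "region", "pack",
            "-path", table_path, "-fid", "all"]


def pack_table(table_path: str) -> None:
    """Run the pack command (assumes maprcli is installed and on PATH).

    Raises subprocess.CalledProcessError if maprcli exits non-zero.
    """
    subprocess.run(build_pack_command(table_path), check=True)


if __name__ == "__main__":
    # Only prints the command; call pack_table("/srctable") to execute it.
    print(" ".join(build_pack_command("/srctable")))
```

Keeping the command construction separate from its execution makes it possible to log or review the exact command before it touches the cluster.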
The steps below walk through an example in which I created a table with a very short TTL, the space was not reclaimed after the data expired, and a region pack reclaimed it quickly.
1) Description of the table, which was created with a TTL of 1 minute.
hbase(main):001:0> describe '/srctable'
Table /srctable is ENABLED
/srctable, {TABLE_ATTRIBUTES => {MAX_FILESIZE => '4294967296', METADATA => {'AUTOSPLIT' => 'true', 'MAPR_UUID' => '5de6339a-a352-bc68-3844-0ad8b6f85900', 'MAX_VALUE_SIZE_IN_MEM' => '100'}}
COLUMN FAMILIES DESCRIPTION
{NAME => 'fam0', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => '60 SECONDS (1 MINUTE)', COMPRESSION => 'LZ4', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '8192', REPLICATION_SCOPE => '0', METADATA => {'compression_raw' => '2'}}
2) Added 50 rows and recorded the size of the table.
[root@node107rhel72 ~]# /opt/mapr/server/tools/loadtest -table /srctable -numrows 50
23:01:49 0 secs 50 rows 50 rows/s 0ms latency 1ms maxLatency
Overall Rate 2941.18 rows/s, Latency 0ms
[root@node107rhel72 ~]# maprcli table info -path /srctable -json
{
"timestamp":1509602514944,
"timeofday":"2017-11-01 11:01:54.944 GMT-0700",
"status":"OK",
"total":1,
"data":[
{
"path":"/srctable",
"numregions":1,
"totallogicalsize":90112,
"totalphysicalsize":81920,
"totalcopypendingsize":0,
"totalrows":50,
"totalnumberofspills":1,
}
]
}
3) The HBase shell sees the 50 rows that were inserted by the tool.
hbase(main):001:0> scan '/srctable'
ROW COLUMN+CELL
user1000385178204227360 column=fam0:col_00, timestamp=1509602508622, value=ec-laeotropic-laeotropism-laeotropous-Laertes-laertes-Laertiades-Laestrygon-Laestry
gones-Laestrygonians-laet-laetation-laeti-laetic-Laetitia-laetrile-laevigate-Laevigrada-laevo-laevo--laevoduction-laevogyrate-laevogyr
e-laevogyrous-laevolactic-laevorotation-laevorotatory-laevotartaric-laevoversion-laevulin-laevulose-LaF-Lafarge-Lafargeville-Lafayette
-lafayette-Lafca\x00
user1000385178204227360 column=fam0:col_01, timestamp=1509602508622, value=ness-cathartics-Cathartidae-Cathartides-cathartin-Cathartolinum-Cathay-Cathayan-Cat
he-cat-head-cathead-catheads-cathect-cathected-cathectic-cathecting-cathection-cathects-cathedra-cathedrae-cathedral-cathedraled-cathe
dralesque-cathedralic-cathedral-like-cathedrallike-cathedrals-cathedralwise-cathedras-cathedrated-cathedratic-cathedratica-cathedratic
al-cath\x00
user4640687271668624146 column=fam0:col_00, timestamp=1509602508622, value=ometer-urobenzoic-urobilin-urobilinemia-urobilinogen-urobilinogenuria-urobilinuria-
urocanic-urocele-Urocerata-urocerid-Uroceridae-urochloralic-urochord-Urochorda-urochordal-urochordate-urochords-urochrome-urochromogen
-urochs-Urocoptidae-Urocoptis-urocyanogen-Urocyon-urocyst-urocystic-Urocystis-urocystitis-urodaeum-Urodela-uro\x00
user4640687271668624146 column=fam0:col_01, timestamp=1509602508622, value=ability-inheritable-inheritableness-inheritably-inheritage-inheritance-inheritances
-inherited-inheriting-inheritor-inheritors-inheritress-inheritresses-inheritrice-inheritrices-inheritrix-inherits-inherle-inhesion-inh
esions-inhesive-inhiate-inhibit-inhibitable-inhibited-inhibiter-inhibiting-inhibition-inhibitionist-inhibitions-\x00
user4640687271668624146 column=fam0:col_02, timestamp=1509602508622, value=d-vallancy-vallar-vallary-vallate-vallated-vallation-Valle-Valleau-Vallecito-Vallec
itos-vallecula-valleculae-vallecular-valleculate-Vallejo-Vallenar-Vallery-Valletta-vallevarite-Valley-valley-valleyful-valleyite-valle
ylet-valleylike-valleys-valleyward-valleywise-Valli-Valliant-vallicula-valliculae-vallicular-v\x00
user4876795174170569834 column=fam0:col_00, timestamp=1509602508622, value=arp-biting-sharp-bottomed-sharp-breasted-sharp-clawed-sharp-cornered-sharp-cut-shar
p-cutting-Sharpe-sharp-eared-sharped-sharp-edged-sharp-elbowed-sharpen-sharpened-sharpener-sharpeners-sharpening-sharpens-sharper-shar
pers-Sharpes-sharpest-sharp-eye-sharp-eyed-sharp-eyes-sharp-faced-sharp-fanged-sharp-feat\x00
user764277275702281672 column=fam0:col_00,
.........................
.........................
user9105318085603802964 column=fam0:col_02, timestamp=1509602508622, value=er-Plutus-plutus-Pluvi-pluvial-pluvialiform-pluvialine-Pluvialis-pluvially-pluvials
-pluvian-pluvine-pluviograph-pluviographic-pluviographical-pluviography-pluviometer-pluviometric-pluviometrical-pluviometrically-pluvi
ometry-pluvioscope-pluvioscopic-Pluviose-pluviose-pluviosity-pluvious-Pluvius-ply-plyboard-plyer-plyers-plygain-plying-plyingly\x00
50 row(s) in 0.3110 seconds
hbase(main):002:0> quit
4) Now wait a minute. The data is marked for deletion because the table's TTL is 1 minute, so you can no longer access it, but the size of the table is not reduced.
hbase(main):001:0> scan '/srctable'
ROW COLUMN+CELL
0 row(s) in 0.0450 seconds
[root@node107rhel72 ~]# maprcli table info -path /srctable -json
{
"timestamp":1509602642299,
"timeofday":"2017-11-01 11:04:02.299 GMT-0700",
"status":"OK",
"total":1,
"data":[
{
"path":"/srctable",
"numregions":1,
"totallogicalsize":90112,
"totalphysicalsize":81920,
"totalcopypendingsize":0,
"totalrows":50,
}
]
}
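The timestamps in the output above explain why the scan returns nothing while the sizes stay the same. The following sketch redoes that arithmetic. It assumes the cell timestamps and the "timestamp" field of the table info output come from the same clock (both in epoch milliseconds), and that a cell expires once its age exceeds the TTL; the exact comparison MapR-DB applies internally is an assumption, and is_expired is a hypothetical helper.

```python
TTL_SECONDS = 60  # TTL of column family fam0, from the describe output


def is_expired(cell_ts_ms: int, now_ms: int, ttl_seconds: int = TTL_SECONDS) -> bool:
    """True once the cell is older than the TTL (assumed strict comparison)."""
    return now_ms - cell_ts_ms > ttl_seconds * 1000


cell_ts = 1509602508622      # timestamp of the scanned cells in step 3
first_info = 1509602514944   # "timestamp" of the table info in step 2
second_info = 1509602642299  # "timestamp" of the table info in step 4

print(is_expired(cell_ts, first_info))   # False: the cells are about 6 seconds old
print(is_expired(cell_ts, second_info))  # True: the cells are about 134 seconds old
```

By the second table info call the cells are past their TTL, so reads filter them out, yet totalrows and both size fields are unchanged because nothing has rewritten the segment.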
5) Now run a region pack on the table so that its size is reduced as expected.
[root@node107rhel72 ~]# maprcli table region pack -path /srctable -fid all
[root@node107rhel72 ~]# maprcli table info -path /srctable -json
{
"timestamp":1509602659308,
"timeofday":"2017-11-01 11:04:19.308 GMT-0700",
"status":"OK",
"total":1,
"data":[
{
"path":"/srctable",
"numregions":1,
"totallogicalsize":16384,
"totalphysicalsize":16384,
"totalcopypendingsize":0,
"totalrows":0,
}
]
}
[root@node107rhel72 ~]#
Space reclaimed: totalphysicalsize dropped from 81920 to 16384 and totalrows from 50 to 0.
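To put a number on the reclaimed space, the two table info outputs can be compared programmatically. The sketch below is illustrative only: the JSON samples are trimmed to a few of the fields shown above (the output as pasted in this post has a trailing comma and would not parse as-is), the helper names are hypothetical, and the size fields are presumably in bytes.

```python
import json

# Trimmed copies of the "maprcli table info -json" output before and after the pack.
BEFORE_PACK = ('{"status":"OK","data":[{"path":"/srctable",'
               '"totallogicalsize":90112,"totalphysicalsize":81920,"totalrows":50}]}')
AFTER_PACK = ('{"status":"OK","data":[{"path":"/srctable",'
              '"totallogicalsize":16384,"totalphysicalsize":16384,"totalrows":0}]}')


def physical_size(info_json: str) -> int:
    """Extract totalphysicalsize of the first table in the output."""
    return json.loads(info_json)["data"][0]["totalphysicalsize"]


def reclaimed(before_json: str, after_json: str) -> int:
    """Physical space released between the two snapshots."""
    return physical_size(before_json) - physical_size(after_json)


freed = reclaimed(BEFORE_PACK, AFTER_PACK)
print(freed)                                            # 65536
print(round(100 * freed / physical_size(BEFORE_PACK)))  # 80
```

For this table the pack released 65536 of 81920, that is 80 percent of the physical size.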
DEBUGGING:
There can be cases where you are told that space isn't reclaimed even after running a pack. To verify whether the data still exists on disk, run a raw scan. If the data that was supposed to be reclaimed is still holding space, it will show up in the rawScan output.
[root@node107rhel72 ~]# maprcli debugdb rawScan -fid 2445.32.131292 -startkey "--INFINITY" -dumpfile /tmp/rawScan.txt
[root@node107rhel72 ~]# cat /tmp/rawScan.txt
Row SpillFid FamilyDataLength EntryTime Column+Cell
user1000385178204227360 2445.54.131336 1084 1510286237058:0 column=fam0:col_00,type=put,timestamp=1510286237058,value=ec-laeotropic-laeotropism-laeotropous-Laertes-laertes-Laertiades-Laestrygon-Laestrygones-Laestrygonians-laet-laetation-laeti-laetic-Laetitia-laetrile-laevigate-Laevigrada-laevo-laevo--laevoduction-laevogyrate-laevogyre-laevogyrous-laevolactic-laevorotation-laevorotatory-laevotartaric-laevoversion-laevulin-laevulose-LaF-Lafarge-Lafargeville-Lafayette-lafayette-Lafca\x00
user1000385178204227360 2445.54.131336 1084 1510286237058:0 column=fam0:col_01,type=put,timestamp=1510286237058,value=ness-cathartics-Cathartidae-Cathartides-cathartin-Cathartolinum-Cathay-Cathayan-Cathe-cat-head-cathead-catheads-cathect-cathected-cathectic-cathecting-cathection-cathects-cathedra-cathedrae-cathedral-cathedraled-cathedralesque-cathedralic-cathedral-like-cathedrallike-cathedrals-cathedralwise-cathedras-cathedrated-cathedratic-cathedratica-cathedratical-cath\x00
user1000385178204227360 2445.54.131336 1084 1510286237058:0 column=fam0:col_02,type=put,timestamp=1510286237058,value=arian-unsectarianism-unsectarianize-unsectarianized-unsectarianizing-unsectional-unsectionalised-unsectionalized-unsectionally-unsectioned-unsecular-unsecularised-unsecularize-unsecularized-unsecularly-unsecurable-unsecurableness-unsecure-unsecured-unsecuredly-unsecuredness-unsecurely-unsecureness-unsecurity-\x00
After I run the table pack, none of the previously existing data shows up.
[root@node107rhel72 ~]# maprcli table region pack -path /srctable -fid all
[root@node107rhel72 ~]# maprcli debugdb rawScan -fid 2445.32.131292 -startkey "--INFINITY" -dumpfile /tmp/rawScanlater.txt
[root@node107rhel72 ~]# cat /tmp/rawScanlater.txt
Row SpillFid FamilyDataLength EntryTime Column+Cell
[root@node107rhel72 ~]#
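For large tables, reading the dump file by eye is impractical, so the check can be reduced to a count. The sketch below assumes the dump format seen above: one header line starting with "Row SpillFid", then one line per cell whose first whitespace-separated field is the row key. The function name summarize_raw_scan is hypothetical, and the sample values are shortened from the output above.

```python
def summarize_raw_scan(dump_text: str):
    """Return (cell_count, distinct_row_count) for the text of a rawScan dump."""
    rows = set()
    cells = 0
    for line in dump_text.splitlines():
        fields = line.split()
        # Skip blank lines and the "Row SpillFid ..." header line.
        if not fields or fields[:2] == ["Row", "SpillFid"]:
            continue
        cells += 1
        rows.add(fields[0])  # first field is the row key
    return cells, len(rows)


HEADER = "Row SpillFid FamilyDataLength EntryTime Column+Cell\n"
before_pack = HEADER + (
    "user1000385178204227360 2445.54.131336 1084 1510286237058:0 "
    "column=fam0:col_00,type=put,timestamp=1510286237058,value=ec-laeotropic\n"
    "user1000385178204227360 2445.54.131336 1084 1510286237058:0 "
    "column=fam0:col_01,type=put,timestamp=1510286237058,value=ness-cathartics\n"
    "user1000385178204227360 2445.54.131336 1084 1510286237058:0 "
    "column=fam0:col_02,type=put,timestamp=1510286237058,value=arian-unsectarianism\n"
)
after_pack = HEADER

print(summarize_raw_scan(before_pack))  # (3, 1)
print(summarize_raw_scan(after_pack))   # (0, 0)
```

Against the real files, the same function can be fed open("/tmp/rawScan.txt").read() and open("/tmp/rawScanlater.txt").read(); a result of (0, 0) after the pack means no expired cells are left holding space.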