Thursday, November 9, 2017

Space not getting reclaimed for MapR DB table




When data is deleted because of a TTL, it is removed from disk the next time its segment is packed, and only then is the space reclaimed. Unlike HBase, MapR-DB does not routinely rewrite all data; this approach reduces write amplification. The side effect is that storage that should be released is not released if the write pattern leaves the same segment without active updates. To reclaim the space almost immediately, you can force MapR-DB to pack segments using the "maprcli table region pack -path <Table Name> -fid all" command.

The steps below walk through an example where I created a table with a very short TTL, the space was not reclaimed, and a forced pack reclaimed it quickly.


1)  Description of the table, which was created with a TTL of 1 minute.

hbase(main):001:0> describe '/srctable'
Table /srctable is ENABLED                                                                                                                                                         
/srctable, {TABLE_ATTRIBUTES => {MAX_FILESIZE => '4294967296', METADATA => {'AUTOSPLIT' => 'true', 'MAPR_UUID' => '5de6339a-a352-bc68-3844-0ad8b6f85900', 'MAX_VALUE_SIZE_IN_MEM' => '100'}}
COLUMN FAMILIES DESCRIPTION
{NAME => 'fam0', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => '60 SECONDS (1 MINUTE)', COMPRESSION => 'LZ4', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '8192', REPLICATION_SCOPE => '0', METADATA => {'compression_raw' => '2'}}

2)  Added 50 rows and recorded the size of the table.

[root@node107rhel72 ~]# /opt/mapr/server/tools/loadtest -table /srctable -numrows 50
23:01:49    0 secs        50 rows       50 rows/s    0ms latency    1ms maxLatency
Overall Rate 2941.18 rows/s, Latency 0ms

[root@node107rhel72 ~]# maprcli table info -path  /srctable -json
{
"timestamp":1509602514944,
"timeofday":"2017-11-01 11:01:54.944 GMT-0700",
"status":"OK",
"total":1,
"data":[
{
"path":"/srctable",
"numregions":1,
"totallogicalsize":90112,
"totalphysicalsize":81920,

"totalcopypendingsize":0,
"totalrows":50,
"totalnumberofspills":1,


}
]
}
3) The HBase shell does see the 50 rows that were inserted by the tool.

hbase(main):001:0> scan '/srctable'
ROW                                            COLUMN+CELL                                                                                                                         
 user1000385178204227360                       column=fam0:col_00, timestamp=1509602508622, value=ec-laeotropic-laeotropism-laeotropous-Laertes-laertes-Laertiades-Laestrygon-Laestry
                                               gones-Laestrygonians-laet-laetation-laeti-laetic-Laetitia-laetrile-laevigate-Laevigrada-laevo-laevo--laevoduction-laevogyrate-laevogyr
                                               e-laevogyrous-laevolactic-laevorotation-laevorotatory-laevotartaric-laevoversion-laevulin-laevulose-LaF-Lafarge-Lafargeville-Lafayette
                                               -lafayette-Lafca\x00                                                                                                                
 user1000385178204227360                       column=fam0:col_01, timestamp=1509602508622, value=ness-cathartics-Cathartidae-Cathartides-cathartin-Cathartolinum-Cathay-Cathayan-Cat
                                               he-cat-head-cathead-catheads-cathect-cathected-cathectic-cathecting-cathection-cathects-cathedra-cathedrae-cathedral-cathedraled-cathe
                                               dralesque-cathedralic-cathedral-like-cathedrallike-cathedrals-cathedralwise-cathedras-cathedrated-cathedratic-cathedratica-cathedratic
                                               al-cath\x00                                                                                                                         


 user4640687271668624146                       column=fam0:col_00, timestamp=1509602508622, value=ometer-urobenzoic-urobilin-urobilinemia-urobilinogen-urobilinogenuria-urobilinuria-

                                               urocanic-urocele-Urocerata-urocerid-Uroceridae-urochloralic-urochord-Urochorda-urochordal-urochordate-urochords-urochrome-urochromogen
                                               -urochs-Urocoptidae-Urocoptis-urocyanogen-Urocyon-urocyst-urocystic-Urocystis-urocystitis-urodaeum-Urodela-uro\x00                  
 user4640687271668624146                       column=fam0:col_01, timestamp=1509602508622, value=ability-inheritable-inheritableness-inheritably-inheritage-inheritance-inheritances
                                               -inherited-inheriting-inheritor-inheritors-inheritress-inheritresses-inheritrice-inheritrices-inheritrix-inherits-inherle-inhesion-inh
                                               esions-inhesive-inhiate-inhibit-inhibitable-inhibited-inhibiter-inhibiting-inhibition-inhibitionist-inhibitions-\x00                
 user4640687271668624146                       column=fam0:col_02, timestamp=1509602508622, value=d-vallancy-vallar-vallary-vallate-vallated-vallation-Valle-Valleau-Vallecito-Vallec
                                               itos-vallecula-valleculae-vallecular-valleculate-Vallejo-Vallenar-Vallery-Valletta-vallevarite-Valley-valley-valleyful-valleyite-valle
                                               ylet-valleylike-valleys-valleyward-valleywise-Valli-Valliant-vallicula-valliculae-vallicular-v\x00                                  
 user4876795174170569834                       column=fam0:col_00, timestamp=1509602508622, value=arp-biting-sharp-bottomed-sharp-breasted-sharp-clawed-sharp-cornered-sharp-cut-shar
                                               p-cutting-Sharpe-sharp-eared-sharped-sharp-edged-sharp-elbowed-sharpen-sharpened-sharpener-sharpeners-sharpening-sharpens-sharper-shar
                                               pers-Sharpes-sharpest-sharp-eye-sharp-eyed-sharp-eyes-sharp-faced-sharp-fanged-sharp-feat\x00                                       
  
 user764277275702281672                        column=fam0:col_00,          
.........................
.........................


 user9105318085603802964                       column=fam0:col_02, timestamp=1509602508622, value=er-Plutus-plutus-Pluvi-pluvial-pluvialiform-pluvialine-Pluvialis-pluvially-pluvials
                                               -pluvian-pluvine-pluviograph-pluviographic-pluviographical-pluviography-pluviometer-pluviometric-pluviometrical-pluviometrically-pluvi
                                               ometry-pluvioscope-pluvioscopic-Pluviose-pluviose-pluviosity-pluvious-Pluvius-ply-plyboard-plyer-plyers-plygain-plying-plyingly\x00 
50 row(s) in 0.3110 seconds

hbase(main):002:0> quit

4)  Now wait a minute. The data is marked for deletion because the TTL of the table is 1 minute, so you cannot access it anymore, but the size of the table is not reduced.

hbase(main):001:0> scan '/srctable'
ROW                                            COLUMN+CELL                                                                                                                         
0 row(s) in 0.0450 seconds

[root@node107rhel72 ~]# maprcli table info -path  /srctable -json
{
"timestamp":1509602642299,
"timeofday":"2017-11-01 11:04:02.299 GMT-0700",
"status":"OK",
"total":1,
"data":[
{
"path":"/srctable",
"numregions":1,
"totallogicalsize":90112,
"totalphysicalsize":81920,
"totalcopypendingsize":0,
"totalrows":50,


}
]
}
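
To see why the scan in step 4 comes back empty, the TTL arithmetic can be sketched with the timestamps from the output above. This is only an illustration: the helper name ttl_expired is made up here, and the exact boundary MapR-DB uses (strictly older than the TTL or not) is an assumption.

```shell
# Sketch: decide whether a cell is past its TTL (ttl_expired is a hypothetical helper).
# Arguments: cell timestamp (ms since epoch), TTL (seconds), current time (ms since epoch).
ttl_expired() {
  cell_ts_ms=$1
  ttl_s=$2
  now_ms=$3
  # Age of the cell in milliseconds at the time of the check.
  age_ms=$(( now_ms - cell_ts_ms ))
  # Assumption: a cell counts as expired once its age exceeds the TTL.
  if [ "$age_ms" -gt $(( ttl_s * 1000 )) ]; then
    echo "expired (age ${age_ms} ms)"
  else
    echo "live (age ${age_ms} ms)"
  fi
}
```

With the cell timestamp 1509602508622 from the scan, "ttl_expired 1509602508622 60 1509602514944" (time of the first table info) reports the cell as live, while "ttl_expired 1509602508622 60 1509602642299" (time of the second table info) reports it as expired, even though totalphysicalsize is unchanged.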


5) Now run region pack on the table so that its size is reduced as expected.

[root@node107rhel72 ~]# maprcli table region pack -path  /srctable -fid all
[root@node107rhel72 ~]# maprcli table info -path  /srctable -json
{
"timestamp":1509602659308,
"timeofday":"2017-11-01 11:04:19.308 GMT-0700",
"status":"OK",
"total":1,
"data":[
{
"path":"/srctable",
"numregions":1,
"totallogicalsize":16384,
"totalphysicalsize":16384,

"totalcopypendingsize":0,
"totalrows":0,
 }
]
}
[root@node107rhel72 ~]#

Space reclaimed.
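
If you want a number instead of comparing the JSON by eye, here is a rough sketch that pulls totalphysicalsize out of two saved "maprcli table info -json" outputs and prints the difference. The function names and the file names are hypothetical, and the sed pattern assumes each field sits on its own line as in the output above.

```shell
# Print the first numeric value of a field from `maprcli table info -json` output on stdin.
# (table_info_field is a hypothetical helper, not part of maprcli.)
table_info_field() {
  sed -n "s/.*\"$1\":\([0-9][0-9]*\).*/\1/p" | head -n 1
}

# Print how many bytes of physical size were freed between two saved outputs.
# $1 = file saved before the pack, $2 = file saved after the pack.
reclaimed_bytes() {
  before=$(table_info_field totalphysicalsize < "$1")
  after=$(table_info_field totalphysicalsize < "$2")
  echo $(( before - after ))
}
```

For example, redirect the table info output to /tmp/before.json before the pack and to /tmp/after.json after it (placeholder paths), then run "reclaimed_bytes /tmp/before.json /tmp/after.json". With the values shown in steps 4 and 5 this is 81920 - 16384 = 65536 bytes.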



DEBUGGING :

There can be cases where you are told that space isn't reclaimed even after running pack. The way to verify whether the data still exists on disk is to run a raw scan. If data that was supposed to be reclaimed is still holding space, it shows up in the rawScan output.

[root@node107rhel72 ~]#  maprcli debugdb rawScan -fid 2445.32.131292 -startkey "--INFINITY" -dumpfile /tmp/rawScan.txt
[root@node107rhel72 ~]# cat /tmp/rawScan.txt
Row SpillFid FamilyDataLength EntryTime Column+Cell
user1000385178204227360 2445.54.131336 1084 1510286237058:0 column=fam0:col_00,type=put,timestamp=1510286237058,value=ec-laeotropic-laeotropism-laeotropous-Laertes-laertes-Laertiades-Laestrygon-Laestrygones-Laestrygonians-laet-laetation-laeti-laetic-Laetitia-laetrile-laevigate-Laevigrada-laevo-laevo--laevoduction-laevogyrate-laevogyre-laevogyrous-laevolactic-laevorotation-laevorotatory-laevotartaric-laevoversion-laevulin-laevulose-LaF-Lafarge-Lafargeville-Lafayette-lafayette-Lafca\x00
user1000385178204227360 2445.54.131336 1084 1510286237058:0 column=fam0:col_01,type=put,timestamp=1510286237058,value=ness-cathartics-Cathartidae-Cathartides-cathartin-Cathartolinum-Cathay-Cathayan-Cathe-cat-head-cathead-catheads-cathect-cathected-cathectic-cathecting-cathection-cathects-cathedra-cathedrae-cathedral-cathedraled-cathedralesque-cathedralic-cathedral-like-cathedrallike-cathedrals-cathedralwise-cathedras-cathedrated-cathedratic-cathedratica-cathedratical-cath\x00
user1000385178204227360 2445.54.131336 1084 1510286237058:0 column=fam0:col_02,type=put,timestamp=1510286237058,value=arian-unsectarianism-unsectarianize-unsectarianized-unsectarianizing-unsectional-unsectionalised-unsectionalized-unsectionally-unsectioned-unsecular-unsecularised-unsecularize-unsecularized-unsecularly-unsecurable-unsecurableness-unsecure-unsecured-unsecuredly-unsecuredness-unsecurely-unsecureness-unsecurity-\x00


Now, after I run the table pack, I don't see any of the data that existed before.

[root@node107rhel72 ~]# maprcli table region pack -path  /srctable -fid all
[root@node107rhel72 ~]#  maprcli debugdb rawScan -fid 2445.32.131292 -startkey "--INFINITY" -dumpfile /tmp/rawScanlater.txt
[root@node107rhel72 ~]# cat /tmp/rawScanlater.txt
Row SpillFid FamilyDataLength EntryTime Column+Cell

[root@node107rhel72 ~]# 
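
For larger tables, reading the dump by eye is not practical, so a small sketch for counting the entries in a rawScan dump file may help. It assumes the dump looks like the ones above: one header line, then one line per cell. The function name raw_scan_entries is hypothetical.

```shell
# Count data lines in a rawScan dump file (raw_scan_entries is a hypothetical helper).
# Assumption: line 1 is the "Row SpillFid ..." header; blank lines are ignored.
raw_scan_entries() {
  awk 'NR > 1 && NF > 0 { n++ } END { print n + 0 }' "$1"
}
```

Running "raw_scan_entries /tmp/rawScan.txt" before the pack and "raw_scan_entries /tmp/rawScanlater.txt" after it should show the count dropping to 0 once the expired data is really gone from disk.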



