Notice

This document is for a development version of Ceph.

Lua Scripting 

New in version Pacific.

This feature allows users to assign execution context to Lua scripts. The supported contexts are:

prerequest which will execute a script before each operation is performed

postauth which will execute a script after each operation is authorized but before it is performed

postrequest which will execute after each operation is performed

background which will execute within a specified time interval

getdata which will execute on objects’ data when objects are downloaded

putdata which will execute on objects’ data when objects are uploaded

A request (pre or post) or data (get or put) context script may be constrained to operations belonging to a specific tenant’s users. The request context script can also access fields in the request and modify certain fields, as well as the Global RGW Table. The data context script can access the content of the object as well as the request fields and the Global RGW Table. All Lua language features can be used in all contexts.

A single execution of a script in a context can use up to 128K bytes of memory. This includes all libraries used by Lua, but not memory that is managed by the RGW itself and may be accessed from Lua. To change this default value, use the rgw_lua_max_memory_per_state configuration parameter. Note that the basic overhead of Lua with its standard libraries is ~32K bytes. To disable the limit, use zero.

By default, the execution of a Lua script is limited to a maximum runtime of 1000 milliseconds. This limit can be changed using the rgw_lua_max_runtime_per_state configuration parameter. If a Lua script exceeds this runtime, it will be terminated. To disable the runtime limit, use zero.

Warning

Be cautious when modifying the memory limit. If the current memory usage exceeds the newly set limit, all previously stored data in the background state will be lost.

Warning

Disabling the runtime limit may result in unbounded script execution, which can lead to excessive resource consumption and potentially impact the RADOS Gateway’s availability.

By default, all Lua standard libraries are available in the script. To allow additional Lua modules to be used in the script, packages can be added to an allowlist:

Make sure that the luarocks package manager is installed on the host.
Adding a Lua package to the allowlist, or removing a package from it, does not install or remove it. For the changes to take effect, a “reload” command should be called.
In addition, all packages in the allowlist are reinstalled using the luarocks package manager on radosgw restart.
To add a package that contains C source code that needs to be compiled, use the --allow-compilation flag. In this case a C compiler needs to be available on the host.
Lua packages are installed in, and used from, a directory local to the radosgw. This means that Lua packages in the allowlist are kept separate from any Lua packages available on the host. By default, this directory is /tmp/luarocks/<entity name>. Its prefix part (/tmp/luarocks/) can be set to a different location via the rgw_luarocks_location configuration parameter. Note that this parameter should not be set to one of the default locations where luarocks installs packages (e.g. $HOME/.luarocks, /usr/lib64/lua, /usr/share/lua).

Script Management via CLI 

To upload a script:

# radosgw-admin script put --infile={lua-file-path} --context={prerequest|postauth|postrequest|background|getdata|putdata} [--tenant={tenant-name}]

When uploading a script with the background context, a tenant name should not be specified.

To upload a script through cephadm shell:

# cephadm shell radosgw-admin script put --infile=/rootfs/{lua-file-path} --context={prerequest|postauth|postrequest|background|getdata|putdata} [--tenant={tenant-name}]

To print the content of the script to standard output:

# radosgw-admin script get --context={prerequest|postauth|postrequest|background|getdata|putdata} [--tenant={tenant-name}]

To remove the script:

# radosgw-admin script rm --context={prerequest|postauth|postrequest|background|getdata|putdata} [--tenant={tenant-name}]

Package Management via CLI 

To add a package to the allowlist:

# radosgw-admin script-package add --package={package name} [--allow-compilation]

To add a specific version of a package to the allowlist:

# radosgw-admin script-package add --package='{package name} {package version}' [--allow-compilation]

When adding a different version of a package which already exists in the list, the newly added version will override the existing one.
When adding a package without a version specified, the latest version of the package will be added.

To remove a package from the allowlist:

# radosgw-admin script-package rm --package={package name}

To remove a specific version of a package from the allowlist:

# radosgw-admin script-package rm --package='{package name} {package version}'

When removing a package without a version specified, any existing versions of the package will be removed.

To print the list of packages in the allowlist:

# radosgw-admin script-package list

To apply changes from the allowlist to all RGWs:

# radosgw-admin script-package reload

Context Free Functions 

Debug Log 

The RGWDebugLog() function accepts a string and prints it to the debug log with priority 20. Each log message is prefixed with Lua INFO:. This function has no return value.

Request Fields 

Warning

This feature is experimental. Fields may be removed or renamed in the future.

Note

Although Lua is a case-sensitive language, field names provided by the radosgw are case-insensitive. Function names remain case-sensitive.
Fields marked “optional” can have a nil value.
Fields marked as “iterable” can be used by the pairs() function and with the # length operator.
All table fields can be used with the bracket operator [].
time fields are strings with the following format: %Y-%m-%d %H:%M:%S.

Field	Type	Description	Iterable	Writable	Optional
`Request.RGWOp`	string	radosgw operation	no	no	no
`Request.DecodedURI`	string	decoded URI	no	no	no
`Request.ContentLength`	integer	size of the request	no	no	no
`Request.GenericAttributes`	table	string to string generic attributes map	yes	no	no
`Request.Response`	table	response to the request	no	no	no
`Request.Response.HTTPStatusCode`	integer	HTTP status code	no	yes	no
`Request.Response.HTTPStatus`	string	HTTP status text	no	yes	no
`Request.Response.RGWCode`	integer	radosgw error code	no	yes	no
`Request.Response.Message`	string	response message	no	yes	no
`Request.SwiftAccountName`	string	swift account name	no	no	yes
`Request.Bucket`	table	info on the bucket	no	no	no
`Request.Bucket.Tenant`	string	tenant of the bucket	no	no	yes
`Request.Bucket.Name`	string	bucket name (writable only in `prerequest` context)	no	yes	no
`Request.Bucket.Marker`	string	bucket marker (initial id)	no	no	yes
`Request.Bucket.Id`	string	bucket id	no	no	yes
`Request.Bucket.ZoneGroupId`	string	zone group of the bucket	no	no	yes
`Request.Bucket.CreationTime`	time	creation time of the bucket	no	no	yes
`Request.Bucket.MTime`	time	modification time of the bucket	no	no	yes
`Request.Bucket.Quota`	table	bucket quota	no	no	yes
`Request.Bucket.Quota.MaxSize`	integer	bucket quota max size	no	no	no
`Request.Bucket.Quota.MaxObjects`	integer	bucket quota max number of objects	no	no	no
`Request.Bucket.Quota.Enabled`	boolean	bucket quota is enabled	no	no	no
`Request.Bucket.Quota.Rounded`	boolean	bucket quota is rounded to 4K	no	no	no
`Request.Bucket.Tags`	table	bucket tags	no	no	yes
`Request.Bucket.PlacementRule`	table	bucket placement rule	no	no	yes
`Request.Bucket.PlacementRule.Name`	string	bucket placement rule name	no	no	no
`Request.Bucket.PlacementRule.StorageClass`	string	bucket placement rule storage class	no	no	no
`Request.Bucket.User`	string	owning user/account id	no	no	yes
`Request.Object`	table	info on the object	no	no	yes
`Request.Object.Name`	string	object name	no	no	no
`Request.Object.Instance`	string	object version	no	no	no
`Request.Object.Id`	string	object id	no	no	no
`Request.Object.Size`	integer	object size	no	no	no
`Request.Object.MTime`	time	object mtime	no	no	no
`Request.CopyFrom`	table	information on copy operation	no	no	yes
`Request.CopyFrom.Tenant`	string	tenant of the object copied from	no	no	no
`Request.CopyFrom.Bucket`	string	bucket of the object copied from	no	no	no
`Request.CopyFrom.Object`	table	object copied from. See: `Request.Object`	no	no	yes
`Request.ObjectOwner`	table	object owner	no	no	no
`Request.ObjectOwner.DisplayName`	string	object owner display name	no	no	no
`Request.ObjectOwner.User`	string	owning user/account id. See: `Request.Bucket.User`	no	no	yes
`Request.ZoneGroup.Name`	string	name of zone group	no	no	no
`Request.ZoneGroup.Endpoint`	string	endpoint of zone group	no	no	no
`Request.UserAcl`	table	user ACL	no	no	no
`Request.UserAcl.Owner`	table	user ACL owner. See: `Request.ObjectOwner`	no	no	no
`Request.UserAcl.Grants`	table	user ACL map of string to grant. Note: grants without an Id are not presented when iterated, and only one of them can be accessed via brackets	yes	no	no
`Request.UserAcl.Grants["<name>"]`	table	user ACL grant	no	no	no
`Request.UserAcl.Grants["<name>"].Type`	integer	user ACL grant type	no	no	no
`Request.UserAcl.Grants["<name>"].User`	string	user ACL grant user/account id	no	no	no
`Request.UserAcl.Grants["<name>"].GroupType`	integer	user ACL grant group type	no	no	yes
`Request.UserAcl.Grants["<name>"].Referer`	string	user ACL grant referer	no	no	yes
`Request.BucketAcl`	table	bucket ACL. See: `Request.UserAcl`	no	no	no
`Request.ObjectAcl`	table	object ACL. See: `Request.UserAcl`	no	no	no
`Request.Environment`	table	string to string environment map	yes	no	no
`Request.Policy`	table	policy	no	no	yes
`Request.Policy.Text`	string	policy text	no	no	no
`Request.Policy.Id`	string	policy Id	no	no	yes
`Request.Policy.Statements`	table	list of string statements	yes	no	no
`Request.UserPolicies`	table	list of user policies	yes	no	no
`Request.UserPolicies[<index>]`	table	user policy. See: `Request.Policy`	no	no	no
`Request.RGWId`	string	radosgw host id: `<host>-<zone>-<zonegroup>`	no	no	no
`Request.HTTP`	table	HTTP header	no	no	no
`Request.HTTP.Parameters`	table	string to string parameter map	yes	no	no
`Request.HTTP.Resources`	table	string to string resource map	yes	no	no
`Request.HTTP.Metadata`	table	string to string metadata map	yes	yes	no
`Request.HTTP.StorageClass`	string	storage class	no	yes	yes
`Request.HTTP.Host`	string	host name	no	no	no
`Request.HTTP.Method`	string	HTTP method	no	no	no
`Request.HTTP.URI`	string	URI	no	no	no
`Request.HTTP.QueryString`	string	HTTP query string	no	no	no
`Request.HTTP.Domain`	string	domain name	no	no	no
`Request.Time`	time	request time	no	no	no
`Request.Dialect`	string	“S3” or “Swift”	no	no	no
`Request.Id`	string	request Id	no	no	no
`Request.TransactionId`	string	transaction Id	no	no	no
`Request.Tags`	table	object tags map	yes	no	no
`Request.User`	table	user that triggered the request	no	no	no
`Request.User.Tenant`	string	triggering user tenant	no	no	no
`Request.User.Id`	string	triggering user id	no	no	no
`Request.Trace`	table	info on trace	no	no	no
`Request.Trace.Enable`	boolean	tracing is enabled	no	yes	no
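
As a rough illustration of the notes on optional and iterable fields, the sketch below guards an optional field against nil and walks an iterable map using fields from the table above. The describe_request helper and the stand-in table used when Request is absent are hypothetical; they exist only so the snippet can also be run in a plain Lua interpreter. Inside radosgw the real Request table is used.

```lua
-- RGWDebugLog and Request exist only inside radosgw. The fallbacks below are
-- hypothetical stand-ins so that the sketch also runs in a plain interpreter.
local log = RGWDebugLog or print
local req = Request or {
  RGWOp = "get_obj",
  SwiftAccountName = nil,                  -- optional field: may be nil
  Environment = { SERVER_PORT = "8000" },  -- stand-in for an iterable map
}

-- Collect a few request fields into a list of printable lines.
function describe_request(r)
  local lines = { "op: " .. r.RGWOp }
  -- Optional fields must be checked before they are concatenated.
  lines[#lines + 1] = "swift account: " .. (r.SwiftAccountName or "<none>")
  -- Iterable fields can be walked with pairs().
  for k, v in pairs(r.Environment) do
    lines[#lines + 1] = "env " .. k .. "=" .. v
  end
  return lines
end

for _, line in ipairs(describe_request(req)) do
  log(line)
end
```

Concatenating a nil value raises a runtime error in Lua, which is why the optional field is replaced by a placeholder string before it is logged.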

Request Functions 

Operations Log 

The Request.Log() function writes the request to the operations log. This function has no parameters. It returns 0 on success and an error code on failure.

Tracing 

Tracing functions can be used only in the postrequest context.

Request.Trace.SetAttribute(<key>, <value>) - sets an attribute on the request’s trace. The function takes two arguments: the first is the key, which should be a string, and the second is the value, which can be either a string or a number (integer or double). You may then locate specific traces by using this attribute.
Request.Trace.AddEvent(<name>, <attributes>) - adds an event to the first span of the request’s trace. An event is defined by an event name, an event time, and zero or more event attributes. The function accepts one or two arguments: the first is a string containing the event name; the second holds the event attributes and may be omitted for events without attributes. An event’s attributes must be a table of strings.

Request Blocking and Error Handling 

Script Execution Errors 

If the Lua script fails with a syntax or runtime error, RGW will log the error. The request that triggered the script will still go through.

Request Blocking and Return Values 

The script’s return value determines how RGW proceeds with the request:

To block the request: the script must return the value RGW_ABORT_REQUEST. RGW interprets this as -EPERM and will stop processing the request.
To continue the request: no return value, or any other return value or type, is treated as success.

Return Value Context 

The Lua script’s return value is evaluated only in the prerequest and postauth contexts and is ignored in any other RGW request-processing context. The HTTP response status code is 403 (Forbidden) by default when a request is blocked by Lua. The response code can be changed using Request.Response.HTTPStatusCode and Request.Response.HTTPStatus. If a request is aborted this way, the data and postrequest contexts will also be aborted.
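
A minimal sketch of request blocking, intended for the prerequest or postauth context, might look as follows. The policy (refusing uploads into a single bucket), the bucket name "archive", and the should_block helper are illustrative assumptions and not part of radosgw; only RGW_ABORT_REQUEST, Request.RGWOp and Request.Bucket.Name come from the text above.

```lua
-- Hypothetical policy: refuse uploads into one bucket. The bucket name and
-- the helper are illustrative assumptions, not part of radosgw.
local READ_ONLY_BUCKET = "archive"

-- Decide whether an operation on a bucket should be blocked.
function should_block(op, bucket_name)
  return op == "put_obj" and bucket_name == READ_ONLY_BUCKET
end

-- Request and RGW_ABORT_REQUEST exist only inside radosgw, hence the guard.
-- The bucket is checked for nil before its name is read.
if Request then
  local bucket_name = Request.Bucket and Request.Bucket.Name
  if should_block(Request.RGWOp, bucket_name) then
    return RGW_ABORT_REQUEST
  end
end
```

Any other outcome falls through without a return value, which is treated as success, so all remaining requests continue normally.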

Background Context 

The background context may be used for purposes that include analytics, monitoring, and caching data for other context executions.

By default, the background script is executed every 5 seconds.
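
As a sketch of the monitoring use case, a background script could periodically log counters that request-context scripts have stored in the Global RGW Table. The key names ("ops.get_obj", "ops.put_obj"), the report helper and the stand-in table are hypothetical; the snippet assumes that some other script maintains those keys.

```lua
-- RGWDebugLog and RGW exist only inside radosgw. The fallbacks are
-- hypothetical stand-ins so the sketch also runs in a plain interpreter.
local log = RGWDebugLog or print
local rgw = RGW or { ["ops.get_obj"] = 12, ["ops.put_obj"] = 3 }

-- Format the named counters as "name=value" pairs; missing keys count as 0.
function report(counters, names)
  local parts = {}
  for _, name in ipairs(names) do
    parts[#parts + 1] = name .. "=" .. tostring(counters[name] or 0)
  end
  return table.concat(parts, ", ")
end

log("request counters: " .. report(rgw, { "ops.get_obj", "ops.put_obj" }))
```

The counters are read by explicit key name, so the sketch does not depend on whether the table can be iterated.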

Data Context 

Both getdata and putdata contexts have the following fields:

Data, which is read-only and iterable (byte by byte). If an object is uploaded or retrieved in multiple chunks, the Data field holds the data of one chunk at a time.
Offset, which holds the offset of the chunk within the entire object.
The Request fields and the Global RGW Table are also available in these contexts.

Global RGW Table 

The RGW Lua table is accessible from all contexts and saves data written to it during execution so that it may be read and used later during other executions, from the same context or a different one.

Each RGW instance has its own private and ephemeral RGW Lua table that is lost when the daemon restarts. Note that background context scripts will run on every instance.
The maximum number of entries in the table is 100,000. Each entry has a string key and a value with a combined length of no more than 1KB. A Lua script will abort with an error if the number of entries or the entry size exceeds these limits.
The RGW Lua table uses string indices and can store values of type string, integer, double and boolean.
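
The following sketch, meant for the postrequest context, stores a string, an integer and a boolean in the table so that a later execution can read them. The key names and the remember_request helper are hypothetical, and a plain Lua table stands in for RGW when the script is run outside radosgw.

```lua
-- RGW exists only inside radosgw; a plain table is used as a stand-in
-- elsewhere (an assumption made so the sketch runs standalone).
local rgw = RGW or {}

-- Remember the outcome of the latest request under fixed string keys.
function remember_request(tbl, op, status_code)
  tbl["last_op"] = op                      -- string value
  tbl["last_status"] = status_code         -- integer value
  tbl["last_failed"] = status_code >= 400  -- boolean value
end

if Request then
  remember_request(rgw, Request.RGWOp, Request.Response.HTTPStatusCode)
end
```

Because requests are handled concurrently, the last writer wins for these keys, and the values disappear when the daemon restarts.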

Increment/Decrement Functions 

Since entries in the RGW table can be accessed from multiple places at the same time, we need a way to atomically increment and decrement numeric values in it. The following functions should be used for that:

RGW.increment(<key>, [value]) increments the value of key by value if value is provided, or by 1 if not.
RGW.decrement(<key>, [value]) decrements the value of key by value if value is provided, or by 1 if not.
If the value of key is not numeric, the execution of the script fails.
If we try to increment or decrement by a non-numeric value, the execution of the script fails.
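
A per-operation request counter is one way these functions might be used. In the sketch below, the "ops." key prefix and the count_op helper are hypothetical. The counter is set to 0 before the first increment as a cautious assumption, since the text only states that non-numeric values cause a failure; that initialization step is not atomic. The stand-in used outside radosgw mimics RGW.increment without any atomicity.

```lua
local rgw = RGW
if not rgw then
  -- Hypothetical, non-atomic stand-in used only outside radosgw.
  rgw = {}
  rgw.increment = function(key, value)
    rgw[key] = rgw[key] + (value or 1)
  end
end

-- Count one request for the given operation name and return the new total.
function count_op(op)
  local key = "ops." .. op
  if rgw[key] == nil then
    rgw[key] = 0  -- make sure the entry is numeric before incrementing
  end
  rgw.increment(key)
  return rgw[key]
end

-- Request exists only in request contexts inside radosgw, hence the guard.
if Request then
  count_op(Request.RGWOp)
end
```

The totals can then be read from another context, for example by a background script that logs them.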

Lua Code Samples 

Print information on source and destination objects in case of copy:

function print_object(object)
  RGWDebugLog("  Name: " .. object.Name)
  RGWDebugLog("  Instance: " .. object.Instance)
  RGWDebugLog("  Id: " .. object.Id)
  RGWDebugLog("  Size: " .. object.Size)
  RGWDebugLog("  MTime: " .. object.MTime)
end

if Request.CopyFrom and Request.Object and Request.CopyFrom.Object then
  RGWDebugLog("copy from object:")
  print_object(Request.CopyFrom.Object)
  RGWDebugLog("to object:")
  print_object(Request.Object)
end

Print ACLs via a “generic function”:

function print_owner(owner)
  RGWDebugLog("Owner:")
  RGWDebugLog("  Display Name: " .. owner.DisplayName)
  -- User is a string (user/account id) and is optional, so it may be nil
  RGWDebugLog("  User: " .. tostring(owner.User))
end

function print_acl(acl_type)
  -- field names are case-insensitive, so "UserACL" matches Request.UserAcl
  local index = acl_type .. "ACL"
  local acl = Request[index]
  if acl then
    RGWDebugLog(acl_type .. "ACL Owner")
    print_owner(acl.Owner)
    RGWDebugLog("  there are " .. #acl.Grants .. " grants for owner")
    for k,v in pairs(acl.Grants) do
      RGWDebugLog("    Grant Key: " .. k)
      RGWDebugLog("    Grant Type: " .. v.Type)
      -- GroupType and Referer are optional and may be nil
      RGWDebugLog("    Grant Group Type: " .. tostring(v.GroupType))
      RGWDebugLog("    Grant Referer: " .. tostring(v.Referer))
      RGWDebugLog("    Grant User: " .. tostring(v.User))
    end
  else
    RGWDebugLog("no " .. acl_type .. " ACL in request: " .. Request.Id)
  end
end

print_acl("User")
print_acl("Bucket")
print_acl("Object")

Use of operations log only in case of errors:

if Request.Response.HTTPStatusCode ~= 200 then
  RGWDebugLog("request is bad, use ops log")
  local rc = Request.Log()
  RGWDebugLog("ops log return code: " .. rc)
end

Set values into the error message:

if Request.Response.HTTPStatusCode == 500 then
  Request.Response.Message = "<Message> something bad happened :-( </Message>"
end

Add metadata to objects that was not originally sent by the client:

In the prerequest context we should add:

if Request.RGWOp == 'put_obj' then
  Request.HTTP.Metadata["x-amz-meta-mydata"] = "my value"
end

In the postrequest context we look at the metadata:

RGWDebugLog("number of metadata entries is: " .. #Request.HTTP.Metadata)
for k, v in pairs(Request.HTTP.Metadata) do
  RGWDebugLog("key=" .. k .. ", " .. "value=" .. v)
end

Use modules to create Unix socket based, JSON encoded, “access log”:

First we should add the following packages to the allowlist:

# radosgw-admin script-package add --package=lua-cjson --allow-compilation
# radosgw-admin script-package add --package=luasocket --allow-compilation

Then, run a server to listen on the Unix socket. For example, use “netcat”:

# rm -f /tmp/socket
# nc -vklU /tmp/socket

Finally, restart the radosgw and upload the following script to the postrequest context:

if Request.RGWOp == "get_obj" then
  local json = require("cjson")
  local socket = require("socket")
  local unix = require("socket.unix")
  local s = assert(unix())
  local E = {}

  local msg = {bucket = (Request.Bucket or (Request.CopyFrom or E).Bucket).Name,
    time = Request.Time,
    operation = Request.RGWOp,
    http_status = Request.Response.HTTPStatusCode,
    error_code = Request.Response.HTTPStatus,
    object_size = Request.Object.Size,
    trans_id = Request.TransactionId}

  assert(s:connect("/tmp/socket"))
  assert(s:send(json.encode(msg).."\n"))
  assert(s:close())
end

Trace only requests of a specific bucket:

Tracing is disabled by default, so we should enable tracing for this specific bucket:

if Request.Bucket.Name == "my-bucket" then
    Request.Trace.Enable = true
end

If tracing is enabled on the RGW, the value of Request.Trace.Enable is true, so we should disable tracing for all other requests that do not match the bucket name. In the prerequest context:

if Request.Bucket.Name ~= "my-bucket" then
    Request.Trace.Enable = false
end

Note that changing Request.Trace.Enable does not change the tracer’s state, but disables or enables the tracing for the request only.

Add information to request traces:

In the postrequest context, we can add attributes and events to the request’s trace:

Request.Trace.AddEvent("lua script execution started")

Request.Trace.SetAttribute("HTTPStatusCode", Request.Response.HTTPStatusCode)

event_attrs = {}
for k,v in pairs(Request.GenericAttributes) do
  event_attrs[k] = v
end

Request.Trace.AddEvent("second event", event_attrs)

The entropy value of an object can be used to detect whether the object is encrypted. The following script calculates the entropy and size of each chunk of an uploaded object and prints them to the debug log.

Add the following script in the putdata context:

function object_entropy()
  local byte_hist = {}
  local byte_hist_size = 256
  for i = 1, byte_hist_size do
    byte_hist[i] = 0
  end
  local total = 0

  -- build a histogram of the byte values in the current chunk
  for _, c in pairs(Data) do
    local byte = c:byte() + 1
    byte_hist[byte] = byte_hist[byte] + 1
    total = total + 1
  end

  local entropy = 0

  -- entropy with logarithm base 256, so the result is between 0 and 1
  for _, count in ipairs(byte_hist) do
    if count ~= 0 then
      local p = 1.0 * count / total
      entropy = entropy - (p * math.log(p)/math.log(byte_hist_size))
    end
  end

  return entropy
end

local full_name = Request.Bucket.Name.."\\"..Request.Object.Name
RGWDebugLog("entropy of chunk of: " .. full_name .. " at offset:" .. tostring(Offset)  ..  " is: " .. tostring(object_entropy()))
RGWDebugLog("payload size of chunk of: " .. full_name .. " is: " .. #Data)
