CipherSweet
Cross-platform, searchable field-level database encryption
Basic CipherSweet Usage
Once you have an engine in play, you can start defining encrypted fields and one or more blind indexes to be used for fast search operations.
Enhanced AAD
Since version 4.7, CipherSweet has an enhanced AAD class that allows you to bind an encrypted field to multiple other plaintext fields and/or some literal string arguments.
<?php
use ParagonIE\CipherSweet\AAD;
$aad = new AAD(['field_1', 'field_2'], ['literal_value_to_authenticate']);
/* AAD is backwards compatible with the old API for EncryptedField: */
$aad = AAD::field('foo');
var_dump($aad->canonicalize(['foo' => 'example']));
// string(7) "example"
$aad = AAD::literal('bar');
var_dump($aad->canonicalize());
// string(3) "bar"
Anywhere that an AAD argument is accepted, you can either pass a string (for the old behavior) or the new AAD class (for advanced workflows).
You can also combine AAD values like so:
$aad = AAD::field('foo');
$combined = $aad->merge(AAD::field('bar'));
var_dump($combined->getFieldNames()); // ["bar", "foo"]
// Notice: merge() doesn't mutate the original AAD object, but returns a new one:
var_dump($aad->getFieldNames()); // ["foo"]
EncryptedField
Basic usage primarily involves the EncryptedField class (as well as one or more instances of BlindIndex), mostly through these methods:
- $encryptedField->prepareForStorage()
- $encryptedField->getBlindIndex()
- $encryptedField->getAllBlindIndexes()
- $encryptedField->encryptValue()
- $encryptedField->decryptValue()
By default, blind indexes do not use the fast hash. To enable it, pass true to the fourth argument of the BlindIndex constructor. You can also use FastBlindIndex and FastCompoundIndex objects instead, which default to faster blind index calculations.
For example, the following code encrypts a user's social security number and then creates two blind indexes: one for a literal search, and one that matches only the last 4 digits.
<?php
use ParagonIE\CipherSweet\BlindIndex;
use ParagonIE\CipherSweet\CipherSweet;
use ParagonIE\CipherSweet\EncryptedField;
use ParagonIE\CipherSweet\Transformation\LastFourDigits;
/** @var CipherSweet $engine */
$ssn = (new EncryptedField($engine, 'contacts', 'ssn'))
// Add a blind index for the "last 4 of SSN":
->addBlindIndex(
new BlindIndex(
// Name (used in key splitting):
'contact_ssn_last_four',
// List of Transforms:
[new LastFourDigits()],
// Bloom filter size (bits)
16,
// Fast hash (default: false)
false
)
)
// Add a blind index for the full SSN:
->addBlindIndex(
new BlindIndex(
'contact_ssn',
[],
32
)
);
// Some example parameters:
$contactInfo = [
'name' => 'John Smith',
'ssn' => '123-45-6789',
'email' => 'foo@example.com'
];
/**
* @var string $ciphertext
* @var array<string, string> $indexes
*/
list ($ciphertext, $indexes) = $ssn->prepareForStorage($contactInfo['ssn']);
Every time you run the above code, the $ciphertext will be randomized, but the array of blind indexes will remain the same.
var_dump($ciphertext, $indexes);
/*
string(73) "nacl:jIRj08YiifK86YlMBfulWXbatpowNYf4_vgjultNT1Tnx2XH9ecs1TqD59MPs67Dp3ui"
array(2) {
["contact_ssn_last_four"]=>
string(4) "2acb"
["contact_ssn"]=>
string(8) "311314c1"
}
*/
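The two index values above are 4 and 8 hexadecimal characters long because the indexes were declared with 16 and 32 bits respectively. The sketch below only illustrates that relationship between the bit size and the output length. It uses a plain HMAC-SHA256 and a hypothetical helper name (truncatedKeyedHash); it is not CipherSweet's actual blind index algorithm, and the key is a placeholder.

```php
<?php
declare(strict_types=1);

/**
 * Illustration only: NOT CipherSweet's blind index algorithm.
 * Computes a keyed hash, keeps the first $bits bits, and hex-encodes them.
 */
function truncatedKeyedHash(string $key, string $input, int $bits): string
{
    if ($bits < 1 || $bits > 256) {
        throw new \InvalidArgumentException('Bit size must be between 1 and 256');
    }
    $raw = \hash_hmac('sha256', $input, $key, true);
    // Number of whole bytes needed to hold $bits bits
    $bytes = \intdiv($bits + 7, 8);
    $out = \substr($raw, 0, $bytes);
    // If $bits is not a multiple of 8, zero out the unused low bits of the last byte
    $extra = $bits % 8;
    if ($extra !== 0) {
        $mask = (0xFF << (8 - $extra)) & 0xFF;
        $out[$bytes - 1] = \chr(\ord($out[$bytes - 1]) & $mask);
    }
    return \bin2hex($out);
}

$key = 'placeholder key for illustration';
var_dump(\strlen(truncatedKeyedHash($key, '6789', 16)));        // int(4)
var_dump(\strlen(truncatedKeyedHash($key, '123-45-6789', 32))); // int(8)
```

Fewer bits mean that more distinct plaintexts share the same index value, so a short index reveals less about the plaintext but returns more false positives when searching.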
If you want the old "typed" index style, simply call setTypedIndexes(true) on any EncryptedField, EncryptedRow or EncryptedMultiRows object.
$ssn->setTypedIndexes(true);
/**
* @var string $ciphertext
* @var array<string, array<string, string>> $indexes
*/
list ($ciphertext, $indexes) = $ssn->prepareForStorage($contactInfo['ssn']);
var_dump($ciphertext, $indexes);
/*
string(73) "nacl:jIRj08YiifK86YlMBfulWXbatpowNYf4_vgjultNT1Tnx2XH9ecs1TqD59MPs67Dp3ui"
array(2) {
["contact_ssn_last_four"]=>
array(2) {
["type"]=>
string(13) "3dywyifwujcu2"
["value"]=>
string(4) "2acb"
}
["contact_ssn"]=>
array(2) {
["type"]=>
string(13) "2iztg3wbd7j5a"
["value"]=>
string(8) "311314c1"
}
}
*/
You can now use these values for inserting/updating records into your database.
To search the database at a later date, use getAllBlindIndexes() or getBlindIndex():
<?php
use ParagonIE\CipherSweet\BlindIndex;
use ParagonIE\CipherSweet\CipherSweet;
use ParagonIE\CipherSweet\EncryptedField;
use ParagonIE\CipherSweet\Transformation\LastFourDigits;
/** @var CipherSweet $engine */
$ssn = (new EncryptedField($engine, 'contacts', 'ssn'))
// Add a blind index for the "last 4 of SSN":
->addBlindIndex(
new BlindIndex(
// Name (used in key splitting):
'contact_ssn_last_four',
// List of Transforms:
[new LastFourDigits()],
// Bloom filter size (bits)
16,
// Fast hash (default: false)
false
)
)
// Add a blind index for the full SSN:
->addBlindIndex(
new BlindIndex(
'contact_ssn',
[],
32
)
);
// Use these values in search queries:
$indexes = $ssn->getAllBlindIndexes('123-45-6789');
$lastFour = $ssn->getBlindIndex('123-45-6789', 'contact_ssn_last_four');
This should result in the following (for the example key):
var_dump($lastFour);
/*
string(4) "2acb"
*/
var_dump($indexes);
/*
array(2) {
["contact_ssn_last_four"]=>
string(4) "2acb"
["contact_ssn"]=>
string(8) "311314c1"
}
*/
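As a rough sketch of how these values might be used in a query, the example below stores one record and then searches for it by the last four digits. The schema (an ssn column for the ciphertext and a contact_ssn_last_four column for the index), the in-memory SQLite database, and the Composer autoload path are assumptions made for illustration; only the CipherSweet calls come from the examples above. Because a 16-bit index can match unrelated rows, the results are decrypted and filtered again in application code.

```php
<?php
declare(strict_types=1);
require 'vendor/autoload.php'; // assumption: CipherSweet installed via Composer

use ParagonIE\CipherSweet\Backend\FIPSCrypto;
use ParagonIE\CipherSweet\BlindIndex;
use ParagonIE\CipherSweet\CipherSweet;
use ParagonIE\CipherSweet\EncryptedField;
use ParagonIE\CipherSweet\KeyProvider\StringProvider;
use ParagonIE\CipherSweet\Transformation\LastFourDigits;

$engine = new CipherSweet(
    new StringProvider('a981d3894b5884f6965baea64a09bb5b4b59c10e857008fc814923cf2f2de558'),
    new FIPSCrypto()
);
$ssn = (new EncryptedField($engine, 'contacts', 'ssn'))
    ->addBlindIndex(new BlindIndex('contact_ssn_last_four', [new LastFourDigits()], 16));

// Hypothetical schema: one column for the ciphertext, one for the blind index
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec('CREATE TABLE contacts (id INTEGER PRIMARY KEY, ssn TEXT, contact_ssn_last_four TEXT)');

// Insert: store the ciphertext alongside its blind index
[$ciphertext, $indexes] = $ssn->prepareForStorage('123-45-6789');
$insert = $db->prepare('INSERT INTO contacts (ssn, contact_ssn_last_four) VALUES (?, ?)');
$insert->execute([$ciphertext, $indexes['contact_ssn_last_four']]);

// Search: recompute the blind index from the search term and match on it
$needle = '6789';
$select = $db->prepare('SELECT ssn FROM contacts WHERE contact_ssn_last_four = ?');
$select->execute([$ssn->getBlindIndex($needle, 'contact_ssn_last_four')]);

// Decrypt the candidates and discard any false positives
$matches = [];
foreach ($select->fetchAll(PDO::FETCH_COLUMN) as $stored) {
    $plaintext = $ssn->decryptValue($stored);
    if (\substr($plaintext, -4) === $needle) {
        $matches[] = $plaintext;
    }
}
```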
EncryptedField with AAD
Since version 1.6.0, both EncryptedField::encryptValue() and EncryptedField::prepareForStorage() accept an optional string as the second parameter, which will be included in the authentication tag on the ciphertext. It will NOT be stored in the ciphertext.
Since version 4.7, you can use Enhanced AAD instead of a string.
/**
* @var string $ciphertext
* @var array<string, string> $indexes
*/
$aad = AAD::literal('example');
list ($ciphertext, $indexes) = $ssn->prepareForStorage($contactInfo['ssn'], $aad);
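Because the AAD is not stored in the ciphertext, the same value has to be supplied again when decrypting. The sketch below assumes that decryptValue() takes the AAD as its second argument, mirroring encryptValue(); the 'contact:1001' strings are made-up values, and \Throwable is caught because the exact exception class is not specified here.

```php
<?php
declare(strict_types=1);
require 'vendor/autoload.php'; // assumption: CipherSweet installed via Composer

use ParagonIE\CipherSweet\Backend\FIPSCrypto;
use ParagonIE\CipherSweet\CipherSweet;
use ParagonIE\CipherSweet\EncryptedField;
use ParagonIE\CipherSweet\KeyProvider\StringProvider;

$engine = new CipherSweet(
    new StringProvider('a981d3894b5884f6965baea64a09bb5b4b59c10e857008fc814923cf2f2de558'),
    new FIPSCrypto()
);
$ssn = new EncryptedField($engine, 'contacts', 'ssn');

// Encrypt with a plain string as AAD (the pre-4.7 behavior)
$ciphertext = $ssn->encryptValue('123-45-6789', 'contact:1001');

// Decrypting with the same AAD recovers the plaintext
$plaintext = $ssn->decryptValue($ciphertext, 'contact:1001');

// Decrypting with a different AAD is expected to fail authentication
$rejected = false;
try {
    $ssn->decryptValue($ciphertext, 'contact:1002');
} catch (\Throwable $ex) {
    $rejected = true;
}
```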
EncryptedRow
An alternative approach for datasets with multiple encrypted fields and/or encrypted boolean fields is the EncryptedRow API, which looks like this:
<?php
use ParagonIE\CipherSweet\BlindIndex;
use ParagonIE\CipherSweet\CipherSweet;
use ParagonIE\CipherSweet\CompoundIndex;
use ParagonIE\CipherSweet\EncryptedRow;
use ParagonIE\CipherSweet\Transformation\LastFourDigits;
/** @var CipherSweet $engine */
// Define two fields (one text, one boolean) that will be encrypted
$row = (new EncryptedRow($engine, 'contacts'))
->addTextField('ssn')
->addBooleanField('hivstatus');
// Add a normal Blind Index on one field:
$row->addBlindIndex(
'ssn',
new BlindIndex(
'contact_ssn_last_four',
[new LastFourDigits()],
32, // 32 bits = 4 bytes,
// Fast hash (default: false)
false
)
);
// Create/add a compound blind index on multiple fields:
$row->addCompoundIndex(
(
new CompoundIndex(
'contact_ssnlast4_hivstatus',
['ssn', 'hivstatus'],
32, // 32 bits = 4 bytes
true // fast hash
)
)->addTransform('ssn', new LastFourDigits())
);
// Notice: You're passing an entire array at once, not a string
$prepared = $row->prepareRowForStorage([
'extraneous' => true,
'ssn' => '123-45-6789',
'hivstatus' => false
]);
var_dump($prepared);
/*
array(2) {
[0]=>
array(3) {
["extraneous"]=>
bool(true)
["ssn"]=>
string(73) "nacl:wVMElYqnHrGB4hU118MTuANZXWHZjbsd0uK2N0Exz72mrV8sLrI_oU94vgsWlWJc84-u"
["hivstatus"]=>
string(61) "nacl:ctWDJBn-NgeWc2mqEWfakvxkG7qCmIKfPpnA7jXHdbZ2CPgnZF0Yzwg="
}
[1]=>
array(2) {
["contact_ssn_last_four"]=>
string(8) "2acbcd1c"
["contact_ssnlast4_hivstatus"]=>
string(8) "cbfd03c0"
}
}
*/
With the EncryptedRow API, you can encrypt a subset of all of the fields in a row, and create compound blind indexes based on multiple pieces of data in the dataset rather than a single field, without writing a ton of glue code.
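Reading a row back is the reverse operation. The following minimal sketch assumes EncryptedRow offers a decryptRow() method that accepts the first element returned by prepareRowForStorage(); fields that were never registered for encryption (such as extraneous) are expected to pass through unchanged.

```php
<?php
declare(strict_types=1);
require 'vendor/autoload.php'; // assumption: CipherSweet installed via Composer

use ParagonIE\CipherSweet\Backend\FIPSCrypto;
use ParagonIE\CipherSweet\CipherSweet;
use ParagonIE\CipherSweet\EncryptedRow;
use ParagonIE\CipherSweet\KeyProvider\StringProvider;

$engine = new CipherSweet(
    new StringProvider('a981d3894b5884f6965baea64a09bb5b4b59c10e857008fc814923cf2f2de558'),
    new FIPSCrypto()
);
$row = (new EncryptedRow($engine, 'contacts'))
    ->addTextField('ssn')
    ->addBooleanField('hivstatus');

$original = [
    'extraneous' => true,
    'ssn' => '123-45-6789',
    'hivstatus' => false
];

// $stored holds the encrypted row; $indexes holds the blind indexes (none defined here)
[$stored, $indexes] = $row->prepareRowForStorage($original);

// Decrypt the stored row back into its plaintext form
$decrypted = $row->decryptRow($stored);
```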
If you want the old "typed" index style, simply call setTypedIndexes(true) on any EncryptedField, EncryptedRow or EncryptedMultiRows object.
// Use typed indexes
$row->setTypedIndexes(true);
// Notice: You're passing an entire array at once, not a string
$prepared = $row->prepareRowForStorage([
'extraneous' => true,
'ssn' => '123-45-6789',
'hivstatus' => false
]);
var_dump($prepared);
/*
array(2) {
[0]=>
array(3) {
["extraneous"]=>
bool(true)
["ssn"]=>
string(73) "nacl:wVMElYqnHrGB4hU118MTuANZXWHZjbsd0uK2N0Exz72mrV8sLrI_oU94vgsWlWJc84-u"
["hivstatus"]=>
string(61) "nacl:ctWDJBn-NgeWc2mqEWfakvxkG7qCmIKfPpnA7jXHdbZ2CPgnZF0Yzwg="
}
[1]=>
array(2) {
["contact_ssn_last_four"]=>
array(2) {
["type"]=>
string(13) "3dywyifwujcu2"
["value"]=>
string(8) "2acbcd1c"
}
["contact_ssnlast4_hivstatus"]=>
array(2) {
["type"]=>
string(13) "nqtcc56kcf4qg"
["value"]=>
string(8) "cbfd03c0"
}
}
}
*/
EncryptedRow with a CompoundIndex using a custom Transform of Multiple Fields
It's possible to quickly create a compound index that uses a transformation that combines multiple fields into one output string.
Following the previous example:
<?php
use ParagonIE\CipherSweet\BlindIndex;
use ParagonIE\CipherSweet\CipherSweet;
use ParagonIE\CipherSweet\CompoundIndex;
use ParagonIE\CipherSweet\Contract\RowTransformationInterface;
use ParagonIE\CipherSweet\EncryptedRow;
use ParagonIE\CipherSweet\Transformation\LastFourDigits;
/**
* Class FirstInitialLastName
*/
class FirstInitialLastName implements RowTransformationInterface
{
/**
* Combine the first initial and the last name into one lowercase string.
*
* @param array $input Row data; must contain 'first_name' and 'last_name'
* @param int $layer
*
* @return string
*/
public function processArray(array $input, $layer = 0): string
{
// The array type declaration already rejects non-array input.
return \strtolower($input['first_name'][0] . $input['last_name']);
}
/**
* Row transformations receive the whole row (an array) rather than
* a single string, and must always return a string.
*
* @param mixed $input
* @return string
*/
public function __invoke($input): string
{
return $this->processArray($input);
}
}
/** @var CipherSweet $engine */
$row = (new EncryptedRow($engine, 'contacts'))
->addTextField('first_name')
->addTextField('last_name')
->addTextField('ssn')
->addBooleanField('hivstatus');
// Add a normal Blind Index on one field:
$row->addBlindIndex(
'ssn',
new BlindIndex(
'contact_ssn_last_four',
[new LastFourDigits()],
32 // 32 bits = 4 bytes
)
);
$row->addCompoundIndex(
(
new CompoundIndex(
'contact_ssnlast4_hivstatus',
['ssn', 'hivstatus'],
32, // 32 bits = 4 bytes
true // fast hash
)
)->addTransform('ssn', new LastFourDigits())
);
// Notice the ->addRowTransform() method:
$row->addCompoundIndex(
$row->createCompoundIndex(
'contact_first_init_last_name',
['first_name', 'last_name'],
64, // 64 bits = 8 bytes
true
)->addRowTransform(new FirstInitialLastName())
);
$prepared = $row->prepareRowForStorage([
'first_name' => 'Jane',
'last_name' => 'Doe',
'extraneous' => true,
'ssn' => '123-45-6789',
'hivstatus' => false
]);
var_dump($prepared);
/*
array(2) {
[0]=>
array(5) {
["first_name"]=>
string(141) "fips:fCCyMZOUMA95S3efKWEgL8Zq7RNYo7vX0pXZl3Ls1iM8k0ST_3y2VpeQQO4BET0EABkVUhnRvIbWXM-MA2gJw6uv1jvoR0nJwiRaHJOAknwvoKT-coHYJuwUT2v_qDAvZVbvdA=="
["last_name"]=>
string(137) "fips:AIJniZTOIaehOUE5fA8PnvUdQSGs24YhTK5bQO3T8wI7a_t11k_Ah5SnlAqjUEXeX-_PpvlbPapqagApxS4_QFjn74xc1IG3e8SaUi8wemxjl-udPWg0xML0wANsTQMCp3EE"
["extraneous"]=>
bool(true)
["ssn"]=>
string(149) "fips:oP6DuYYErL-lZqfgX1pOfjTJHzCNtx8w5ZBrT78sypnc5waFd7K-9Qu0-GojHFXqnlJe5Cvj9x1doooR6ijy1fIKle5JpzjZeSe0nbJP44atuNJqDg6JMkTSLsNylaQoULxEHR5mFTcAKOA="
["hivstatus"]=>
string(137) "fips:3QGNnjNPZTFNoSC4kKEWfevvcSQ1hRWhWrc9agh9PVPvWesJeZCwskFakeCFAB_5zSSRbKgGXFMlIk-2lJphJrl5OuHBmCSeB_E_mBU931k4rHfz3_OP-rGnB8H9CAfVpw=="
}
[1]=>
array(3) {
["contact_ssn_last_four"]=>
string(8) "a88e74ad"
["contact_ssnlast4_hivstatus"]=>
string(8) "417daacf"
["contact_first_init_last_name"]=>
string(16) "81f9316ceccea014"
}
}
*/
The above snippet defines a custom implementation of RowTransformationInterface that concatenates the first initial and the last name, then lowercases the result.
Note: You can achieve the same overall effect (but not the same hash output) using the default CompoundIndex.
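A sketch of that alternative, using only per-field transforms on a default compound index, might look like the following. It relies on the FirstCharacter and Lowercase transformations from the ParagonIE\CipherSweet\Transformation namespace; the second set of names ('JACK', 'doe') is made up to show that records normalizing to the same first initial and last name produce the same index.

```php
<?php
declare(strict_types=1);
require 'vendor/autoload.php'; // assumption: CipherSweet installed via Composer

use ParagonIE\CipherSweet\Backend\FIPSCrypto;
use ParagonIE\CipherSweet\CipherSweet;
use ParagonIE\CipherSweet\EncryptedRow;
use ParagonIE\CipherSweet\KeyProvider\StringProvider;
use ParagonIE\CipherSweet\Transformation\FirstCharacter;
use ParagonIE\CipherSweet\Transformation\Lowercase;

$engine = new CipherSweet(
    new StringProvider('a981d3894b5884f6965baea64a09bb5b4b59c10e857008fc814923cf2f2de558'),
    new FIPSCrypto()
);
$row = (new EncryptedRow($engine, 'contacts'))
    ->addTextField('first_name')
    ->addTextField('last_name');

// No custom class: each field gets its own chain of transforms
$row->addCompoundIndex(
    $row->createCompoundIndex(
        'contact_first_init_last_name',
        ['first_name', 'last_name'],
        64, // 64 bits = 8 bytes
        true // fast hash
    )
        ->addTransform('first_name', new Lowercase())
        ->addTransform('first_name', new FirstCharacter())
        ->addTransform('last_name', new Lowercase())
);

$jane = $row->prepareRowForStorage(['first_name' => 'Jane', 'last_name' => 'Doe']);
$jack = $row->prepareRowForStorage(['first_name' => 'JACK', 'last_name' => 'doe']);

// Both rows reduce to first initial "j" and last name "doe"
$sameIndex = $jane[1]['contact_first_init_last_name'] === $jack[1]['contact_first_init_last_name'];
```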
EncryptedRow with AAD
You can also specify a separate plaintext column (e.g. primary or foreign key) as additional authenticated data.
This binds the ciphertext to a specific row, which prevents an attacker who is capable of replacing ciphertexts from using legitimate app access to decrypt ciphertexts they wouldn't otherwise have access to.
$row->setAadSourceField('first_name', 'contact_id');
This can also be included during the table instantiation:
<?php
use ParagonIE\CipherSweet\CipherSweet;
use ParagonIE\CipherSweet\EncryptedRow;
/** @var CipherSweet $engine */
$row = (new EncryptedRow($engine, 'contacts'))
->addTextField('first_name', 'contact_id');
/* ... */
Since version 4.7, you can use Enhanced AAD instead of a string for the field name. You can also bind a field to multiple plaintext fields.
$aad = (AAD::field('contact_id'))
->merge(AAD::field('foreign_key_1'))
->merge(AAD::field('foreign_key_2'));
$row->setAadSourceField('first_name', $aad);
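To illustrate the row binding described in this section, the sketch below encrypts a row whose first_name is bound to contact_id, then simulates an attacker moving the ciphertext to a row with a different contact_id. It assumes encryptRow() and decryptRow() methods on EncryptedRow, uses made-up string IDs, and catches \Throwable because the exact exception class is not specified here.

```php
<?php
declare(strict_types=1);
require 'vendor/autoload.php'; // assumption: CipherSweet installed via Composer

use ParagonIE\CipherSweet\Backend\FIPSCrypto;
use ParagonIE\CipherSweet\CipherSweet;
use ParagonIE\CipherSweet\EncryptedRow;
use ParagonIE\CipherSweet\KeyProvider\StringProvider;

$engine = new CipherSweet(
    new StringProvider('a981d3894b5884f6965baea64a09bb5b4b59c10e857008fc814923cf2f2de558'),
    new FIPSCrypto()
);
// Bind the first_name ciphertext to the plaintext contact_id column
$row = (new EncryptedRow($engine, 'contacts'))
    ->addTextField('first_name', 'contact_id');

$encrypted = $row->encryptRow(['contact_id' => '1001', 'first_name' => 'Jane']);

// Same row, same contact_id: decryption succeeds
$decrypted = $row->decryptRow($encrypted);

// Simulate the ciphertext being copied into a different row
$tampered = $encrypted;
$tampered['contact_id'] = '1002';

$rejected = false;
try {
    $row->decryptRow($tampered);
} catch (\Throwable $ex) {
    // The AAD no longer matches, so authentication is expected to fail
    $rejected = true;
}
```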
EncryptedMultiRows
CipherSweet also provides a multi-row abstraction to make it easier to manage heavily-normalized databases.
When working with EncryptedMultiRows, your arrays should be formatted as follows:
$input = [
'table1' => [
'column1' => 'value',
'columnB' => 123456,
// ...
],
'table2' => [ /* ... */ ],
// ...
];
For example:
<?php
use ParagonIE\CipherSweet\CipherSweet;
use ParagonIE\CipherSweet\Transformation\AlphaCharactersOnly;
use ParagonIE\CipherSweet\Transformation\FirstCharacter;
use ParagonIE\CipherSweet\Transformation\Lowercase;
use ParagonIE\CipherSweet\Backend\FIPSCrypto;
use ParagonIE\CipherSweet\KeyProvider\StringProvider;
use ParagonIE\CipherSweet\EncryptedMultiRows;
$provider = new StringProvider(
// Example key, chosen randomly, hex-encoded:
'a981d3894b5884f6965baea64a09bb5b4b59c10e857008fc814923cf2f2de558'
);
$engine = new CipherSweet($provider, new FIPSCrypto());
$rowSet = (new EncryptedMultiRows($engine))
->addTextField('contacts', 'first_name')
->addTextField('contacts', 'last_name')
->addFloatField('contacts', 'latitude')
->addFloatField('contacts', 'longitude')
->addTextField('foobar', 'test');
$rowSet->addCompoundIndex(
'contacts',
$rowSet->createCompoundIndex(
'contacts',
'contact_first_init_last_name',
['first_name', 'last_name'],
64, // 64 bits = 8 bytes
true
)
->addTransform('first_name', new AlphaCharactersOnly())
->addTransform('first_name', new Lowercase())
->addTransform('first_name', new FirstCharacter())
->addTransform('last_name', new AlphaCharactersOnly())
->addTransform('last_name', new Lowercase())
);
$prepared = $rowSet->prepareForStorage([
'contacts' => [
'contactid' => 12345,
'first_name' => 'Jane',
'last_name' => 'Doe',
'latitude' => 52.52,
'longitude' => -33.106,
'extraneous' => true
],
'foobar' => [
'foobarid' => 23,
'contactid' => 12345,
'test' => 'paragonie'
]
]);
var_dump($prepared);
This will produce something similar to the following output:
array(2) {
[0]=>
array(2) {
["contacts"]=>
array(6) {
["contactid"]=>
int(12345)
["first_name"]=>
string(141) "fips:8NSLNDWxN4u7OeN_v5ahnt-tgTNqrarsdhPwhMFT4uqtMsELj5L1D7KhukM1OSOKdwtgytiaut3-1kvtP8eSiIH8bQLidw3MwUFQ0JaxvNldI7rzVKeMP3yp4UVSrJZNH89nvQ=="
["last_name"]=>
string(137) "fips:uk9FtD5HvXY4Fe8_ibXF32FurmV8WvAUVSWUPVhOcfmHNC-nol7EnNjdQ5vBG2HQmpeRaTjSE5QZNZ9TQGeK-HgaO3V_MCVQDTtN2u9-3HR4ehSFjn8rHbGt31Ygrh4CV6WV"
["latitude"]=>
string(145) "fips:HE1PQoMso4FBu_rJWk0adWnp9i6HSBXQbf3QaHp1cw8-tOCDSm3rjiE1zIIrUmKarprPRzCTzb2BxdiXVg3RNsLH8iSko0ZmXSXhTa51XoEByxaH9fvAILpXttIfk8rsSXoIKgvMfcY="
["longitude"]=>
string(145) "fips:4gwnipUOws0kLW9gLmIgUNOM65ba1SVkibxILmJOpCbvw3853v_AaEGD-PO3b0fNwVnD6zbWdpovtHblAlXX2iOUvfqgrnwO21vPcYt8FaFkT706-_ZvbRioooL7NwFBqvJJWpiTnhA="
["extraneous"]=>
bool(true)
}
["foobar"]=>
array(3) {
["foobarid"]=>
int(23)
["contactid"]=>
int(12345)
["test"]=>
string(145) "fips:vnoJ6rIEBBMLCvXMt4gke8CT6PomgAExNufTZUrpPd3rp9y28jgopmXA7w8reqVe3SfE6KhRvN-lt5GQhzR1miQPVaIVq2V6D1i4eZCSKQDBmJ7PTAYuigNd9DPSL4qW3OAOtvagJ4Lc"
}
}
[1]=>
array(2) {
["contacts"]=>
array(1) {
["contact_first_init_last_name"]=>
string(16) "546b1ffd1f83c37a"
}
["foobar"]=>
array(0) {
}
}
}
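Decryption works on the same nested table/column structure. This minimal sketch assumes a decryptManyRows() counterpart to encryptManyRows() and passes it the first element returned by prepareForStorage(); it uses a reduced version of the configuration above.

```php
<?php
declare(strict_types=1);
require 'vendor/autoload.php'; // assumption: CipherSweet installed via Composer

use ParagonIE\CipherSweet\Backend\FIPSCrypto;
use ParagonIE\CipherSweet\CipherSweet;
use ParagonIE\CipherSweet\EncryptedMultiRows;
use ParagonIE\CipherSweet\KeyProvider\StringProvider;

$engine = new CipherSweet(
    new StringProvider('a981d3894b5884f6965baea64a09bb5b4b59c10e857008fc814923cf2f2de558'),
    new FIPSCrypto()
);
$rowSet = (new EncryptedMultiRows($engine))
    ->addTextField('contacts', 'first_name')
    ->addFloatField('contacts', 'latitude')
    ->addTextField('foobar', 'test');

// $stored mirrors the input layout, with the registered fields encrypted
[$stored, $indexes] = $rowSet->prepareForStorage([
    'contacts' => [
        'contactid' => 12345,
        'first_name' => 'Jane',
        'latitude' => 52.52
    ],
    'foobar' => [
        'foobarid' => 23,
        'test' => 'paragonie'
    ]
]);

// Decrypt every table in one call
$decrypted = $rowSet->decryptManyRows($stored);
```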
EncryptedMultiRows with AAD
You can specify a separate plaintext column (e.g. primary or foreign key) as additional authenticated data.
This binds the ciphertext to a specific row, which prevents an attacker who is capable of replacing ciphertexts from using legitimate app access to decrypt ciphertexts they wouldn't otherwise have access to.
$rowSet->setAadSourceField('contacts', 'first_name', 'contact_id');
This can also be included during the table instantiation:
<?php
use ParagonIE\CipherSweet\CipherSweet;
use ParagonIE\CipherSweet\EncryptedMultiRows;
/** @var CipherSweet $engine */
$rowSet = (new EncryptedMultiRows($engine))
->addTextField('contacts', 'first_name', 'contact_id');
/* ... */
Since version 4.7, you can use Enhanced AAD instead of a string for the field name. You can also bind a field to multiple plaintext fields.
$aad = (AAD::field('contact_id'))
->merge(AAD::field('foreign_key_1'))
->merge(AAD::field('foreign_key_2'));
$rowSet->setAadSourceField('contacts', 'first_name', $aad);
EncryptedMultiRows with Automatic Context-Binding
Since version 4.7, you can have CipherSweet automatically generate AAD for a given EncryptedMultiRows object.
For example, given the following configuration:
<?php
use ParagonIE\CipherSweet\CipherSweet;
use ParagonIE\CipherSweet\EncryptedMultiRows;
/** @var CipherSweet $engine */
$multiRowEncryptor = new EncryptedMultiRows($engine);
$multiRowEncryptor
->addTextField('table1', 'field1')
->addIntegerField('table1', 'field2')
->addFloatField('table1', 'field3')
->addOptionalBooleanField('table1', 'field4')
->addTextField('table2', 'foo')
->addTextField('table3', 'bar');
$encrypted = $multiRowEncryptor->encryptManyRows([
'table1' => ['field1' => 'hello world', 'field2' => 42, 'field3' => 3.1416],
'table2' => ['id' => 3, 'foo' => 'joy'],
'table3' => ['bar' => 'coy'],
]);
The diff between the previous code snippet and one with automatic context-binding ("Easy Mode") enabled looks like this:
$multiRowEncryptor = new EncryptedMultiRows($engine);
$multiRowEncryptor
+ ->setAutoBindContext(true)
+ ->setPrimaryKeyColumn('table2', 'id')
->addTextField('table1', 'field1')
With this change, every encrypted field is explicitly cryptographically bound to its context (table name, field name) with no further action needed from the developer.
Additionally, table2 is cryptographically bound to its primary key (id). This has two consequences:
- You cannot copy ciphertexts between rows and decrypt successfully. This is a good thing.
- However, you must know the primary key when inserting new records, in order to provide it to CipherSweet.
That second point is the main reason why we are not enabling it by default. (Also, we'd kind of need to know your primary key naming convention, which we cannot know for everyone that uses this library.)
That said, we highly recommend binding to the primary key column.
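One way to satisfy the "must know the primary key when inserting" requirement is to generate the key in the application instead of relying on a database auto-increment. The sketch below is an assumption about how you might structure this, not a recommendation from the library: the random hex ID is a made-up convention, and decryptManyRows() is assumed to be the counterpart of encryptManyRows().

```php
<?php
declare(strict_types=1);
require 'vendor/autoload.php'; // assumption: CipherSweet installed via Composer

use ParagonIE\CipherSweet\Backend\FIPSCrypto;
use ParagonIE\CipherSweet\CipherSweet;
use ParagonIE\CipherSweet\EncryptedMultiRows;
use ParagonIE\CipherSweet\KeyProvider\StringProvider;

$engine = new CipherSweet(
    new StringProvider('a981d3894b5884f6965baea64a09bb5b4b59c10e857008fc814923cf2f2de558'),
    new FIPSCrypto()
);
$multiRowEncryptor = (new EncryptedMultiRows($engine))
    ->setAutoBindContext(true)
    ->setPrimaryKeyColumn('table2', 'id')
    ->addTextField('table2', 'foo');

// Hypothetical convention: choose the primary key before encrypting,
// so that it is available to CipherSweet at insert time
$newId = \bin2hex(\random_bytes(16));

$encrypted = $multiRowEncryptor->encryptManyRows([
    'table2' => ['id' => $newId, 'foo' => 'joy'],
]);

// The same primary key must be present in the row when decrypting
$decrypted = $multiRowEncryptor->decryptManyRows($encrypted);
```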