Merge tag 'lsk-v3.10-android-15.02'

LSK Android 15.02 v3.10

Conflicts:
	drivers/Kconfig
	drivers/regulator/core.c
	include/linux/of.h
Huang, Tao
2015-03-05 17:11:40 +08:00
331 changed files with 12741 additions and 2023 deletions


@@ -0,0 +1,28 @@
What: /sys/firmware/devicetree/*
Date: November 2013
Contact: Grant Likely <grant.likely@linaro.org>
Description:
When using OpenFirmware or a Flattened Device Tree to enumerate
hardware, the device tree structure will be exposed in this
directory.
It is possible for multiple device-tree directories to exist.
Some device drivers use a separate detached device tree which
has no attachment to the system tree and will appear in a
different subdirectory under /sys/firmware/devicetree.
Userspace must not use the /sys/firmware/devicetree/base
path directly, but instead should follow the /proc/device-tree
symlink. It is possible that the absolute path will change
in the future, but the symlink is the stable ABI.
The /proc/device-tree symlink replaces the devicetree /proc
filesystem support, and has largely the same semantics and
should be compatible with existing userspace.
The contents of /sys/firmware/devicetree/ are a
hierarchy of directories, one per device tree node. The
directory name is the resolved path component name (node
name plus address). Properties are represented as files
in the directory. Each file contains the exact
binary data from the device tree.
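Because each property file holds the raw property bytes, userspace has to decode them itself. The following is a minimal sketch, assuming the property is an array of 32-bit cells (device tree cells are stored big-endian). The helper name dt_read_cell() is hypothetical and is not part of any kernel or libc API; the bytes would normally come from reading a file reached through the /proc/device-tree symlink.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Decode the idx-th 32-bit cell from raw property bytes.
 * Cells are big-endian, so the bytes are assembled by hand
 * instead of casting the buffer to a host-endian integer.
 * Returns 0 on success, -1 if the cell lies outside the buffer.
 */
static int dt_read_cell(const uint8_t *data, size_t len, size_t idx,
                        uint32_t *out)
{
    assert(data != NULL && out != NULL);

    if (idx >= len / 4)
        return -1;

    const uint8_t *p = data + idx * 4;
    *out = ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
    return 0;
}
```

For a "reg" property holding <0xf0000000 0x2000>, cell 0 decodes to 0xf0000000 and cell 1 to 0x2000 regardless of host endianness.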


@@ -0,0 +1,40 @@
A DT changeset is a method which allows one to apply changes
in the live tree in such a way that either the full set of changes
will be applied, or none of them will be. If an error occurs partway
through applying the changeset, then the tree will be rolled back to the
previous state. A changeset can also be removed after it has been
applied.
When a changeset is applied, all of the changes get applied to the tree
at once before emitting OF_RECONFIG notifiers. This is so that the
receiver sees a complete and consistent state of the tree when it
receives the notifier.
The sequence of a changeset is as follows.
1. of_changeset_init() - initializes a changeset
2. A number of DT tree change calls, of_changeset_attach_node(),
of_changeset_detach_node(), of_changeset_add_property(),
of_changeset_remove_property() and of_changeset_update_property(), to
prepare a set of changes. No changes to the active tree are made at this
point. All the change operations are recorded in the of_changeset
'entries' list.
3. mutex_lock(of_mutex) - starts a changeset. The global of_mutex
ensures there can only be one editor at a time.
4. of_changeset_apply() - applies the changes to the tree. Either the
entire changeset will get applied, or if there is an error the tree will
be restored to the previous state.
5. mutex_unlock(of_mutex) - all operations complete, release the mutex.
If a successfully applied changeset needs to be removed, it can be done
with the following sequence.
1. mutex_lock(of_mutex)
2. of_changeset_revert()
3. mutex_unlock(of_mutex)
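The all-or-nothing behaviour described above can be illustrated outside the kernel. The sketch below is a toy userspace model, not the drivers/of implementation: the "tree" is a plain int array, every toy_* name is hypothetical, and the of_mutex locking and OF_RECONFIG notifiers are deliberately left out. It only shows the record, apply, roll back on error, and revert pattern.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_ENTRIES 8

/* One recorded change: set slot `index` of the toy tree to `new_value`. */
struct toy_entry {
    size_t index;
    int new_value;
    int old_value;   /* filled in during apply, used for rollback */
};

struct toy_changeset {
    struct toy_entry entries[TOY_MAX_ENTRIES];
    size_t count;
};

static void toy_changeset_init(struct toy_changeset *cs)
{
    assert(cs != NULL);
    cs->count = 0;
}

/* Record a change; the tree itself is not touched here. */
static int toy_changeset_add(struct toy_changeset *cs, size_t index, int value)
{
    if (cs->count == TOY_MAX_ENTRIES)
        return -1;
    cs->entries[cs->count].index = index;
    cs->entries[cs->count].new_value = value;
    cs->entries[cs->count].old_value = 0;
    cs->count++;
    return 0;
}

/* Undo the first n applied entries, newest first. */
static void toy_undo(const struct toy_changeset *cs, int *tree, size_t n)
{
    while (n-- > 0)
        tree[cs->entries[n].index] = cs->entries[n].old_value;
}

/* Apply every entry or none: an out-of-range index rolls everything back. */
static int toy_changeset_apply(struct toy_changeset *cs, int *tree,
                               size_t tree_len)
{
    for (size_t i = 0; i < cs->count; i++) {
        struct toy_entry *e = &cs->entries[i];

        if (e->index >= tree_len) {
            toy_undo(cs, tree, i);
            return -1;
        }
        e->old_value = tree[e->index];
        tree[e->index] = e->new_value;
    }
    return 0;
}

/* Remove a successfully applied changeset. */
static void toy_changeset_revert(const struct toy_changeset *cs, int *tree)
{
    toy_undo(cs, tree, cs->count);
}
```

Saving the old value at apply time is what makes both the error rollback and the later revert possible with the same undo routine.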


@@ -0,0 +1,25 @@
Device Tree Dynamic Resolver Notes
----------------------------------
This document describes the implementation of the in-kernel
Device Tree resolver, residing in drivers/of/resolver.c. It is a
companion document to Documentation/devicetree/dt-object-internal.txt[1].
How the resolver works
----------------------
The resolver is given as an input an arbitrary tree compiled with the
proper dtc option and having a /plugin/ tag. This generates the
appropriate __fixups__ & __local_fixups__ nodes as described in [1].
In sequence the resolver works by the following steps:
1. Get the maximum device tree phandle value from the live tree + 1.
2. Adjust all the local phandles of the tree to resolve by that amount.
3. Using the __local_fixups__ node information adjust all local references
by the same amount.
4. For each property in the __fixups__ node locate the node it references
in the live tree. This is the label used to tag the node.
5. Retrieve the phandle of the target of the fixup.
6. For each fixup in the property locate the node:property:offset location
and replace it with the phandle value.
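The arithmetic behind the steps above can be sketched in userspace. This is a toy model under stated assumptions, not the code in drivers/of/resolver.c: phandles are plain arrays of uint32_t, overflow of the phandle space is ignored, and all toy_* names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Step 1: highest phandle in the live tree, plus one. */
static uint32_t toy_phandle_delta(const uint32_t *live, size_t n)
{
    uint32_t max = 0;

    assert(live != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        if (live[i] > max)
            max = live[i];
    return max + 1;
}

/* Steps 2 and 3: shift every local phandle (or local reference) by delta. */
static void toy_adjust_local(uint32_t *values, size_t n, uint32_t delta)
{
    for (size_t i = 0; i < n; i++)
        values[i] += delta;
}

/*
 * Step 6: store the target's phandle, big-endian, at a byte offset
 * inside a property value. Returns -1 if the offset does not fit.
 */
static int toy_patch_fixup(uint8_t *prop, size_t len, size_t offset,
                           uint32_t phandle)
{
    if (len < 4 || offset > len - 4)
        return -1;
    prop[offset]     = (uint8_t)(phandle >> 24);
    prop[offset + 1] = (uint8_t)(phandle >> 16);
    prop[offset + 2] = (uint8_t)(phandle >> 8);
    prop[offset + 3] = (uint8_t)phandle;
    return 0;
}
```

Shifting by "maximum live phandle + 1" guarantees that no phandle in the tree being resolved collides with one already in the live tree.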


@@ -0,0 +1,133 @@
Device Tree Overlay Notes
-------------------------
This document describes the implementation of the in-kernel
device tree overlay functionality residing in drivers/of/overlay.c. It is a
companion document to Documentation/devicetree/dt-object-internal.txt[1] and
Documentation/devicetree/dynamic-resolution-notes.txt[2].
How overlays work
-----------------
The purpose of a Device Tree overlay is to modify the kernel's live tree and
to have the modification affect the state of the kernel in a way that
reflects the changes.
Since the kernel mainly deals with devices, any new device node that results
in an active device should have that device created, while if the device node
is either disabled or removed altogether, the affected device should be
deregistered.
Let's take an example where we have a foo board with the following base tree
which is taken from [1].
---- foo.dts -----------------------------------------------------------------
/* FOO platform */
/ {
compatible = "corp,foo";
/* shared resources */
res: res {
};
/* On chip peripherals */
ocp: ocp {
/* peripherals that are always instantiated */
peripheral1 { ... };
}
};
---- foo.dts -----------------------------------------------------------------
The overlay bar.dts, when loaded (and resolved as described in [2]) should
---- bar.dts -----------------------------------------------------------------
/plugin/; /* allow undefined label references and record them */
/ {
.... /* various properties for loader use; i.e. part id etc. */
fragment@0 {
target = <&ocp>;
__overlay__ {
/* bar peripheral */
bar {
compatible = "corp,bar";
... /* various properties and child nodes */
}
};
};
};
---- bar.dts -----------------------------------------------------------------
result in foo+bar.dts
---- foo+bar.dts -------------------------------------------------------------
/* FOO platform + bar peripheral */
/ {
compatible = "corp,foo";
/* shared resources */
res: res {
};
/* On chip peripherals */
ocp: ocp {
/* peripherals that are always instantiated */
peripheral1 { ... };
/* bar peripheral */
bar {
compatible = "corp,bar";
... /* various properties and child nodes */
}
}
};
---- foo+bar.dts -------------------------------------------------------------
As a result of the overlay, a new device node (bar) has been created
so a bar platform device will be registered and if a matching device driver
is loaded the device will be created as expected.
Overlay in-kernel API
--------------------------------
The API is quite easy to use.
1. Call of_overlay_create() to create and apply an overlay. The return value
is a cookie identifying this overlay.
2. Call of_overlay_destroy() to remove and clean up the overlay previously
created via the call to of_overlay_create(). Removal of an overlay that
has another overlay stacked on top of it will not be permitted.
Finally, if you need to remove all overlays in one go, just call
of_overlay_destroy_all() which will remove every single one in the correct
order.
Overlay DTS Format
------------------
The DTS of an overlay should have the following format:
{
/* ignored properties by the overlay */
fragment@0 { /* first child node */
target=<phandle>; /* phandle target of the overlay */
or
target-path="/path"; /* target path of the overlay */
__overlay__ {
property-a; /* add property-a to the target */
node-a { /* add to an existing, or create a node-a */
...
};
};
}
fragment@1 { /* second child node */
...
};
/* more fragments follow */
}
Using the non-phandle based target method allows one to use a base DT which does
not contain a __symbols__ node, i.e. it was not compiled with the -@ option.
The __symbols__ node is only required for the target=<phandle> method, since it
contains the information required to map from a phandle to a tree location.
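To make the target-path variant concrete, here is a toy userspace model of applying one fragment. It is only a sketch under simplifying assumptions: the live tree is a flat list of full node paths, properties are ignored, and every toy_* name is hypothetical. It is not how drivers/of/overlay.c is implemented.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define TOY_MAX_NODES 16
#define TOY_MAX_PATH  64

/* Toy live tree: a flat list of full node paths. */
struct toy_tree {
    char paths[TOY_MAX_NODES][TOY_MAX_PATH];
    int count;
};

/* Return the index of `path` in the tree, or -1 if it is absent. */
static int toy_find(const struct toy_tree *t, const char *path)
{
    for (int i = 0; i < t->count; i++)
        if (strcmp(t->paths[i], path) == 0)
            return i;
    return -1;
}

/* Add a node; adding an existing node is a no-op (it is merged into). */
static int toy_add(struct toy_tree *t, const char *path)
{
    assert(t != NULL && path != NULL);
    if (toy_find(t, path) >= 0)
        return 0;
    if (t->count == TOY_MAX_NODES || strlen(path) >= TOY_MAX_PATH)
        return -1;
    strcpy(t->paths[t->count++], path);
    return 0;
}

/*
 * Apply one fragment: create `child` under the node named by
 * target_path. Fails if the target does not exist in the tree.
 */
static int toy_apply_fragment(struct toy_tree *t, const char *target_path,
                              const char *child)
{
    char full[TOY_MAX_PATH];
    const char *base = strcmp(target_path, "/") == 0 ? "" : target_path;
    int n;

    if (toy_find(t, target_path) < 0)
        return -1;
    n = snprintf(full, sizeof(full), "%s/%s", base, child);
    if (n < 0 || (size_t)n >= sizeof(full))
        return -1;
    return toy_add(t, full);
}
```

With a base tree containing "/" and "/ocp", applying a fragment with target path "/ocp" and child "bar" yields the node "/ocp/bar", mirroring the foo+bar example earlier in the document.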


@@ -1061,6 +1061,7 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
i8042.notimeout [HW] Ignore timeout condition signalled by controller
i8042.reset [HW] Reset the controller during init and cleanup
i8042.unlock [HW] Unlock (ignore) the keylock
+i8042.kbdreset [HW] Reset device connected to KBD port
i810= [HW,DRM]


@@ -1,6 +1,6 @@
VERSION = 3
PATCHLEVEL = 10
-SUBLEVEL = 65
+SUBLEVEL = 68
EXTRAVERSION =
NAME = TOSSUG Baby Fish


@@ -20,7 +20,7 @@
/* this is for console on PGU */
/* bootargs = "console=tty0 consoleblank=0"; */
/* this is for console on serial */
-bootargs = "earlycon=uart8250,mmio32,0xc0000000,115200n8 console=tty0 console=ttyS0,115200n8 consoleblank=0 debug";
+bootargs = "earlycon=uart8250,mmio32,0xf0000000,115200n8 console=tty0 console=ttyS0,115200n8 consoleblank=0 debug";
};
aliases {
@@ -46,9 +46,9 @@
#interrupt-cells = <1>;
};
-uart0: serial@c0000000 {
+uart0: serial@f0000000 {
compatible = "ns8250";
-reg = <0xc0000000 0x2000>;
+reg = <0xf0000000 0x2000>;
interrupts = <11>;
clock-frequency = <3686400>;
baud = <115200>;
@@ -57,21 +57,21 @@
no-loopback-test = <1>;
};
-pgu0: pgu@c9000000 {
+pgu0: pgu@f9000000 {
compatible = "snps,arcpgufb";
-reg = <0xc9000000 0x400>;
+reg = <0xf9000000 0x400>;
};
-ps2: ps2@c9001000 {
+ps2: ps2@f9001000 {
compatible = "snps,arc_ps2";
-reg = <0xc9000400 0x14>;
+reg = <0xf9000400 0x14>;
interrupts = <13>;
interrupt-names = "arc_ps2_irq";
};
-eth0: ethernet@c0003000 {
+eth0: ethernet@f0003000 {
compatible = "snps,oscilan";
-reg = <0xc0003000 0x44>;
+reg = <0xf0003000 0x44>;
interrupts = <7>, <8>;
interrupt-names = "rx", "tx";
};


@@ -2386,6 +2386,13 @@ config NEON
Say Y to include support code for NEON, the ARMv7 Advanced SIMD
Extension.
+config KERNEL_MODE_NEON
+bool "Support for NEON in kernel mode"
+default n
+depends on NEON
+help
+Say Y to include support for NEON in kernel mode.
endmenu
menu "Userspace binary formats"


@@ -141,7 +141,7 @@
#size-cells = <0>;
compatible = "fsl,imx25-cspi", "fsl,imx35-cspi";
reg = <0x43fa4000 0x4000>;
-clocks = <&clks 62>, <&clks 62>;
+clocks = <&clks 78>, <&clks 78>;
clock-names = "ipg", "per";
interrupts = <14>;
status = "disabled";
@@ -335,7 +335,7 @@
compatible = "fsl,imx25-pwm", "fsl,imx27-pwm";
#pwm-cells = <2>;
reg = <0x53fa0000 0x4000>;
-clocks = <&clks 106>, <&clks 36>;
+clocks = <&clks 106>, <&clks 52>;
clock-names = "ipg", "per";
interrupts = <36>;
};
@@ -354,7 +354,7 @@
compatible = "fsl,imx25-pwm", "fsl,imx27-pwm";
#pwm-cells = <2>;
reg = <0x53fa8000 0x4000>;
-clocks = <&clks 107>, <&clks 36>;
+clocks = <&clks 107>, <&clks 52>;
clock-names = "ipg", "per";
interrupts = <41>;
};
@@ -394,7 +394,7 @@
pwm4: pwm@53fc8000 {
compatible = "fsl,imx25-pwm", "fsl,imx27-pwm";
reg = <0x53fc8000 0x4000>;
-clocks = <&clks 108>, <&clks 36>;
+clocks = <&clks 108>, <&clks 52>;
clock-names = "ipg", "per";
interrupts = <42>;
};
@@ -439,7 +439,7 @@
compatible = "fsl,imx25-pwm", "fsl,imx27-pwm";
#pwm-cells = <2>;
reg = <0x53fe0000 0x4000>;
-clocks = <&clks 105>, <&clks 36>;
+clocks = <&clks 105>, <&clks 52>;
clock-names = "ipg", "per";
interrupts = <26>;
};


@@ -1,6 +1,9 @@
/ {
testcase-data {
security-password = "password";
duplicate-name = "duplicate";
duplicate-name { };
phandle-tests {
provider0: provider0 {
#phandle-cells = <0>;

arch/arm/crypto/.gitignore

@@ -0,0 +1 @@
aesbs-core.S


@@ -3,7 +3,21 @@
#
obj-$(CONFIG_CRYPTO_AES_ARM) += aes-arm.o
+obj-$(CONFIG_CRYPTO_AES_ARM_BS) += aes-arm-bs.o
obj-$(CONFIG_CRYPTO_SHA1_ARM) += sha1-arm.o
+obj-$(CONFIG_CRYPTO_SHA1_ARM_NEON) += sha1-arm-neon.o
+obj-$(CONFIG_CRYPTO_SHA512_ARM_NEON) += sha512-arm-neon.o
-aes-arm-y := aes-armv4.o aes_glue.o
-sha1-arm-y := sha1-armv4-large.o sha1_glue.o
+aes-arm-y := aes-armv4.o aes_glue.o
+aes-arm-bs-y := aesbs-core.o aesbs-glue.o
+sha1-arm-y := sha1-armv4-large.o sha1_glue.o
+sha1-arm-neon-y := sha1-armv7-neon.o sha1_neon_glue.o
+sha512-arm-neon-y := sha512-armv7-neon.o sha512_neon_glue.o
+quiet_cmd_perl = PERL $@
+cmd_perl = $(PERL) $(<) > $(@)
+$(src)/aesbs-core.S_shipped: $(src)/bsaes-armv7.pl
+$(call cmd,perl)
+.PRECIOUS: $(obj)/aesbs-core.S


@@ -6,22 +6,12 @@
#include <linux/crypto.h>
#include <crypto/aes.h>
-#define AES_MAXNR 14
+#include "aes_glue.h"
-typedef struct {
-unsigned int rd_key[4 *(AES_MAXNR + 1)];
-int rounds;
-} AES_KEY;
-struct AES_CTX {
-AES_KEY enc_key;
-AES_KEY dec_key;
-};
-asmlinkage void AES_encrypt(const u8 *in, u8 *out, AES_KEY *ctx);
-asmlinkage void AES_decrypt(const u8 *in, u8 *out, AES_KEY *ctx);
-asmlinkage int private_AES_set_decrypt_key(const unsigned char *userKey, const int bits, AES_KEY *key);
-asmlinkage int private_AES_set_encrypt_key(const unsigned char *userKey, const int bits, AES_KEY *key);
+EXPORT_SYMBOL(AES_encrypt);
+EXPORT_SYMBOL(AES_decrypt);
+EXPORT_SYMBOL(private_AES_set_encrypt_key);
+EXPORT_SYMBOL(private_AES_set_decrypt_key);
static void aes_encrypt(struct crypto_tfm *tfm, u8 *dst, const u8 *src)
{
@@ -81,7 +71,7 @@ static struct crypto_alg aes_alg = {
.cipher = {
.cia_min_keysize = AES_MIN_KEY_SIZE,
.cia_max_keysize = AES_MAX_KEY_SIZE,
-.cia_setkey = aes_set_key,
+.cia_setkey = aes_set_key,
.cia_encrypt = aes_encrypt,
.cia_decrypt = aes_decrypt
}
@@ -103,6 +93,6 @@ module_exit(aes_fini);
MODULE_DESCRIPTION("Rijndael (AES) Cipher Algorithm (ASM)");
MODULE_LICENSE("GPL");
-MODULE_ALIAS("aes");
-MODULE_ALIAS("aes-asm");
+MODULE_ALIAS_CRYPTO("aes");
+MODULE_ALIAS_CRYPTO("aes-asm");
MODULE_AUTHOR("David McCullough <ucdevel@gmail.com>");


@@ -0,0 +1,19 @@
#define AES_MAXNR 14
struct AES_KEY {
unsigned int rd_key[4 * (AES_MAXNR + 1)];
int rounds;
};
struct AES_CTX {
struct AES_KEY enc_key;
struct AES_KEY dec_key;
};
asmlinkage void AES_encrypt(const u8 *in, u8 *out, struct AES_KEY *ctx);
asmlinkage void AES_decrypt(const u8 *in, u8 *out, struct AES_KEY *ctx);
asmlinkage int private_AES_set_decrypt_key(const unsigned char *userKey,
const int bits, struct AES_KEY *key);
asmlinkage int private_AES_set_encrypt_key(const unsigned char *userKey,
const int bits, struct AES_KEY *key);

File diff suppressed because it is too large


@@ -0,0 +1,434 @@
/*
* linux/arch/arm/crypto/aesbs-glue.c - glue code for NEON bit sliced AES
*
* Copyright (C) 2013 Linaro Ltd <ard.biesheuvel@linaro.org>
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License version 2 as
* published by the Free Software Foundation.
*/
#include <asm/neon.h>
#include <crypto/aes.h>
#include <crypto/ablk_helper.h>
#include <crypto/algapi.h>
#include <linux/module.h>
#include "aes_glue.h"
#define BIT_SLICED_KEY_MAXSIZE (128 * (AES_MAXNR - 1) + 2 * AES_BLOCK_SIZE)
struct BS_KEY {
struct AES_KEY rk;
int converted;
u8 __aligned(8) bs[BIT_SLICED_KEY_MAXSIZE];
} __aligned(8);
asmlinkage void bsaes_enc_key_convert(u8 out[], struct AES_KEY const *in);
asmlinkage void bsaes_dec_key_convert(u8 out[], struct AES_KEY const *in);
asmlinkage void bsaes_cbc_encrypt(u8 const in[], u8 out[], u32 bytes,
struct BS_KEY *key, u8 iv[]);
asmlinkage void bsaes_ctr32_encrypt_blocks(u8 const in[], u8 out[], u32 blocks,
struct BS_KEY *key, u8 const iv[]);
asmlinkage void bsaes_xts_encrypt(u8 const in[], u8 out[], u32 bytes,
struct BS_KEY *key, u8 tweak[]);
asmlinkage void bsaes_xts_decrypt(u8 const in[], u8 out[], u32 bytes,
struct BS_KEY *key, u8 tweak[]);
struct aesbs_cbc_ctx {
struct AES_KEY enc;
struct BS_KEY dec;
};
struct aesbs_ctr_ctx {
struct BS_KEY enc;
};
struct aesbs_xts_ctx {
struct BS_KEY enc;
struct BS_KEY dec;
struct AES_KEY twkey;
};
static int aesbs_cbc_set_key(struct crypto_tfm *tfm, const u8 *in_key,
unsigned int key_len)
{
struct aesbs_cbc_ctx *ctx = crypto_tfm_ctx(tfm);
int bits = key_len * 8;
if (private_AES_set_encrypt_key(in_key, bits, &ctx->enc)) {
tfm->crt_flags |= CRYPTO_TFM_RES_BAD_KEY_LEN;
return -EINVAL;
}
ctx->dec.rk = ctx->enc;
private_AES_set_decrypt_key(in_key, bits, &ctx->dec.rk);
ctx->dec.converted = 0;
return 0;
}
static int aesbs_ctr_set_key(struct crypto_tfm *tfm, const u8 *in_key,
unsigned int key_len)
{
struct aesbs_ctr_ctx *ctx = crypto_tfm_ctx(tfm);
int bits = key_len * 8;
if (private_AES_set_encrypt_key(in_key, bits, &ctx->enc.rk)) {
tfm->crt_flags |= CRYPTO_TFM_RES_BAD_KEY_LEN;
return -EINVAL;
}
ctx->enc.converted = 0;
return 0;
}
static int aesbs_xts_set_key(struct crypto_tfm *tfm, const u8 *in_key,
unsigned int key_len)
{
struct aesbs_xts_ctx *ctx = crypto_tfm_ctx(tfm);
int bits = key_len * 4;
if (private_AES_set_encrypt_key(in_key, bits, &ctx->enc.rk)) {
tfm->crt_flags |= CRYPTO_TFM_RES_BAD_KEY_LEN;
return -EINVAL;
}
ctx->dec.rk = ctx->enc.rk;
private_AES_set_decrypt_key(in_key, bits, &ctx->dec.rk);
private_AES_set_encrypt_key(in_key + key_len / 2, bits, &ctx->twkey);
ctx->enc.converted = ctx->dec.converted = 0;
return 0;
}
static int aesbs_cbc_encrypt(struct blkcipher_desc *desc,
struct scatterlist *dst,
struct scatterlist *src, unsigned int nbytes)
{
struct aesbs_cbc_ctx *ctx = crypto_blkcipher_ctx(desc->tfm);
struct blkcipher_walk walk;
int err;
blkcipher_walk_init(&walk, dst, src, nbytes);
err = blkcipher_walk_virt(desc, &walk);
while (walk.nbytes) {
u32 blocks = walk.nbytes / AES_BLOCK_SIZE;
u8 *src = walk.src.virt.addr;
if (walk.dst.virt.addr == walk.src.virt.addr) {
u8 *iv = walk.iv;
do {
crypto_xor(src, iv, AES_BLOCK_SIZE);
AES_encrypt(src, src, &ctx->enc);
iv = src;
src += AES_BLOCK_SIZE;
} while (--blocks);
memcpy(walk.iv, iv, AES_BLOCK_SIZE);
} else {
u8 *dst = walk.dst.virt.addr;
do {
crypto_xor(walk.iv, src, AES_BLOCK_SIZE);
AES_encrypt(walk.iv, dst, &ctx->enc);
memcpy(walk.iv, dst, AES_BLOCK_SIZE);
src += AES_BLOCK_SIZE;
dst += AES_BLOCK_SIZE;
} while (--blocks);
}
err = blkcipher_walk_done(desc, &walk, walk.nbytes % AES_BLOCK_SIZE);
}
return err;
}
static int aesbs_cbc_decrypt(struct blkcipher_desc *desc,
struct scatterlist *dst,
struct scatterlist *src, unsigned int nbytes)
{
struct aesbs_cbc_ctx *ctx = crypto_blkcipher_ctx(desc->tfm);
struct blkcipher_walk walk;
int err;
blkcipher_walk_init(&walk, dst, src, nbytes);
err = blkcipher_walk_virt_block(desc, &walk, 8 * AES_BLOCK_SIZE);
while ((walk.nbytes / AES_BLOCK_SIZE) >= 8) {
kernel_neon_begin();
bsaes_cbc_encrypt(walk.src.virt.addr, walk.dst.virt.addr,
walk.nbytes, &ctx->dec, walk.iv);
kernel_neon_end();
err = blkcipher_walk_done(desc, &walk, walk.nbytes % AES_BLOCK_SIZE);
}
while (walk.nbytes) {
u32 blocks = walk.nbytes / AES_BLOCK_SIZE;
u8 *dst = walk.dst.virt.addr;
u8 *src = walk.src.virt.addr;
u8 bk[2][AES_BLOCK_SIZE];
u8 *iv = walk.iv;
do {
if (walk.dst.virt.addr == walk.src.virt.addr)
memcpy(bk[blocks & 1], src, AES_BLOCK_SIZE);
AES_decrypt(src, dst, &ctx->dec.rk);
crypto_xor(dst, iv, AES_BLOCK_SIZE);
if (walk.dst.virt.addr == walk.src.virt.addr)
iv = bk[blocks & 1];
else
iv = src;
dst += AES_BLOCK_SIZE;
src += AES_BLOCK_SIZE;
} while (--blocks);
err = blkcipher_walk_done(desc, &walk, walk.nbytes % AES_BLOCK_SIZE);
}
return err;
}
static void inc_be128_ctr(__be32 ctr[], u32 addend)
{
int i;
for (i = 3; i >= 0; i--, addend = 1) {
u32 n = be32_to_cpu(ctr[i]) + addend;
ctr[i] = cpu_to_be32(n);
if (n >= addend)
break;
}
}
static int aesbs_ctr_encrypt(struct blkcipher_desc *desc,
struct scatterlist *dst, struct scatterlist *src,
unsigned int nbytes)
{
struct aesbs_ctr_ctx *ctx = crypto_blkcipher_ctx(desc->tfm);
struct blkcipher_walk walk;
u32 blocks;
int err;
blkcipher_walk_init(&walk, dst, src, nbytes);
err = blkcipher_walk_virt_block(desc, &walk, 8 * AES_BLOCK_SIZE);
while ((blocks = walk.nbytes / AES_BLOCK_SIZE)) {
u32 tail = walk.nbytes % AES_BLOCK_SIZE;
__be32 *ctr = (__be32 *)walk.iv;
u32 headroom = UINT_MAX - be32_to_cpu(ctr[3]);
/* avoid 32 bit counter overflow in the NEON code */
if (unlikely(headroom < blocks)) {
blocks = headroom + 1;
tail = walk.nbytes - blocks * AES_BLOCK_SIZE;
}
kernel_neon_begin();
bsaes_ctr32_encrypt_blocks(walk.src.virt.addr,
walk.dst.virt.addr, blocks,
&ctx->enc, walk.iv);
kernel_neon_end();
inc_be128_ctr(ctr, blocks);
nbytes -= blocks * AES_BLOCK_SIZE;
if (nbytes && nbytes == tail && nbytes <= AES_BLOCK_SIZE)
break;
err = blkcipher_walk_done(desc, &walk, tail);
}
if (walk.nbytes) {
u8 *tdst = walk.dst.virt.addr + blocks * AES_BLOCK_SIZE;
u8 *tsrc = walk.src.virt.addr + blocks * AES_BLOCK_SIZE;
u8 ks[AES_BLOCK_SIZE];
AES_encrypt(walk.iv, ks, &ctx->enc.rk);
if (tdst != tsrc)
memcpy(tdst, tsrc, nbytes);
crypto_xor(tdst, ks, nbytes);
err = blkcipher_walk_done(desc, &walk, 0);
}
return err;
}
static int aesbs_xts_encrypt(struct blkcipher_desc *desc,
struct scatterlist *dst,
struct scatterlist *src, unsigned int nbytes)
{
struct aesbs_xts_ctx *ctx = crypto_blkcipher_ctx(desc->tfm);
struct blkcipher_walk walk;
int err;
blkcipher_walk_init(&walk, dst, src, nbytes);
err = blkcipher_walk_virt_block(desc, &walk, 8 * AES_BLOCK_SIZE);
/* generate the initial tweak */
AES_encrypt(walk.iv, walk.iv, &ctx->twkey);
while (walk.nbytes) {
kernel_neon_begin();
bsaes_xts_encrypt(walk.src.virt.addr, walk.dst.virt.addr,
walk.nbytes, &ctx->enc, walk.iv);
kernel_neon_end();
err = blkcipher_walk_done(desc, &walk, walk.nbytes % AES_BLOCK_SIZE);
}
return err;
}
static int aesbs_xts_decrypt(struct blkcipher_desc *desc,
struct scatterlist *dst,
struct scatterlist *src, unsigned int nbytes)
{
struct aesbs_xts_ctx *ctx = crypto_blkcipher_ctx(desc->tfm);
struct blkcipher_walk walk;
int err;
blkcipher_walk_init(&walk, dst, src, nbytes);
err = blkcipher_walk_virt_block(desc, &walk, 8 * AES_BLOCK_SIZE);
/* generate the initial tweak */
AES_encrypt(walk.iv, walk.iv, &ctx->twkey);
while (walk.nbytes) {
kernel_neon_begin();
bsaes_xts_decrypt(walk.src.virt.addr, walk.dst.virt.addr,
walk.nbytes, &ctx->dec, walk.iv);
kernel_neon_end();
err = blkcipher_walk_done(desc, &walk, walk.nbytes % AES_BLOCK_SIZE);
}
return err;
}
static struct crypto_alg aesbs_algs[] = { {
.cra_name = "__cbc-aes-neonbs",
.cra_driver_name = "__driver-cbc-aes-neonbs",
.cra_priority = 0,
.cra_flags = CRYPTO_ALG_TYPE_BLKCIPHER,
.cra_blocksize = AES_BLOCK_SIZE,
.cra_ctxsize = sizeof(struct aesbs_cbc_ctx),
.cra_alignmask = 7,
.cra_type = &crypto_blkcipher_type,
.cra_module = THIS_MODULE,
.cra_blkcipher = {
.min_keysize = AES_MIN_KEY_SIZE,
.max_keysize = AES_MAX_KEY_SIZE,
.ivsize = AES_BLOCK_SIZE,
.setkey = aesbs_cbc_set_key,
.encrypt = aesbs_cbc_encrypt,
.decrypt = aesbs_cbc_decrypt,
},
}, {
.cra_name = "__ctr-aes-neonbs",
.cra_driver_name = "__driver-ctr-aes-neonbs",
.cra_priority = 0,
.cra_flags = CRYPTO_ALG_TYPE_BLKCIPHER,
.cra_blocksize = 1,
.cra_ctxsize = sizeof(struct aesbs_ctr_ctx),
.cra_alignmask = 7,
.cra_type = &crypto_blkcipher_type,
.cra_module = THIS_MODULE,
.cra_blkcipher = {
.min_keysize = AES_MIN_KEY_SIZE,
.max_keysize = AES_MAX_KEY_SIZE,
.ivsize = AES_BLOCK_SIZE,
.setkey = aesbs_ctr_set_key,
.encrypt = aesbs_ctr_encrypt,
.decrypt = aesbs_ctr_encrypt,
},
}, {
.cra_name = "__xts-aes-neonbs",
.cra_driver_name = "__driver-xts-aes-neonbs",
.cra_priority = 0,
.cra_flags = CRYPTO_ALG_TYPE_BLKCIPHER,
.cra_blocksize = AES_BLOCK_SIZE,
.cra_ctxsize = sizeof(struct aesbs_xts_ctx),
.cra_alignmask = 7,
.cra_type = &crypto_blkcipher_type,
.cra_module = THIS_MODULE,
.cra_blkcipher = {
.min_keysize = 2 * AES_MIN_KEY_SIZE,
.max_keysize = 2 * AES_MAX_KEY_SIZE,
.ivsize = AES_BLOCK_SIZE,
.setkey = aesbs_xts_set_key,
.encrypt = aesbs_xts_encrypt,
.decrypt = aesbs_xts_decrypt,
},
}, {
.cra_name = "cbc(aes)",
.cra_driver_name = "cbc-aes-neonbs",
.cra_priority = 300,
.cra_flags = CRYPTO_ALG_TYPE_ABLKCIPHER|CRYPTO_ALG_ASYNC,
.cra_blocksize = AES_BLOCK_SIZE,
.cra_ctxsize = sizeof(struct async_helper_ctx),
.cra_alignmask = 7,
.cra_type = &crypto_ablkcipher_type,
.cra_module = THIS_MODULE,
.cra_init = ablk_init,
.cra_exit = ablk_exit,
.cra_ablkcipher = {
.min_keysize = AES_MIN_KEY_SIZE,
.max_keysize = AES_MAX_KEY_SIZE,
.ivsize = AES_BLOCK_SIZE,
.setkey = ablk_set_key,
.encrypt = __ablk_encrypt,
.decrypt = ablk_decrypt,
}
}, {
.cra_name = "ctr(aes)",
.cra_driver_name = "ctr-aes-neonbs",
.cra_priority = 300,
.cra_flags = CRYPTO_ALG_TYPE_ABLKCIPHER|CRYPTO_ALG_ASYNC,
.cra_blocksize = 1,
.cra_ctxsize = sizeof(struct async_helper_ctx),
.cra_alignmask = 7,
.cra_type = &crypto_ablkcipher_type,
.cra_module = THIS_MODULE,
.cra_init = ablk_init,
.cra_exit = ablk_exit,
.cra_ablkcipher = {
.min_keysize = AES_MIN_KEY_SIZE,
.max_keysize = AES_MAX_KEY_SIZE,
.ivsize = AES_BLOCK_SIZE,
.setkey = ablk_set_key,
.encrypt = ablk_encrypt,
.decrypt = ablk_decrypt,
}
}, {
.cra_name = "xts(aes)",
.cra_driver_name = "xts-aes-neonbs",
.cra_priority = 300,
.cra_flags = CRYPTO_ALG_TYPE_ABLKCIPHER|CRYPTO_ALG_ASYNC,
.cra_blocksize = AES_BLOCK_SIZE,
.cra_ctxsize = sizeof(struct async_helper_ctx),
.cra_alignmask = 7,
.cra_type = &crypto_ablkcipher_type,
.cra_module = THIS_MODULE,
.cra_init = ablk_init,
.cra_exit = ablk_exit,
.cra_ablkcipher = {
.min_keysize = 2 * AES_MIN_KEY_SIZE,
.max_keysize = 2 * AES_MAX_KEY_SIZE,
.ivsize = AES_BLOCK_SIZE,
.setkey = ablk_set_key,
.encrypt = ablk_encrypt,
.decrypt = ablk_decrypt,
}
} };
static int __init aesbs_mod_init(void)
{
if (!cpu_has_neon())
return -ENODEV;
return crypto_register_algs(aesbs_algs, ARRAY_SIZE(aesbs_algs));
}
static void __exit aesbs_mod_exit(void)
{
crypto_unregister_algs(aesbs_algs, ARRAY_SIZE(aesbs_algs));
}
module_init(aesbs_mod_init);
module_exit(aesbs_mod_exit);
MODULE_DESCRIPTION("Bit sliced AES in CBC/CTR/XTS modes using NEON");
MODULE_AUTHOR("Ard Biesheuvel <ard.biesheuvel@linaro.org>");
MODULE_LICENSE("GPL");
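The counter handling in aesbs_ctr_encrypt() and inc_be128_ctr() above can be exercised in userspace. The sketch below is a standalone re-implementation for illustration, not the kernel code: it works on a plain 16-byte array instead of __be32 words, and the toy_* and load/store helper names are hypothetical. It shows the two ideas in the diff: limiting the block count so the low 32-bit counter word does not wrap inside the NEON routine, and then carrying into the upper words.

```c
#include <assert.h>
#include <stdint.h>

static uint32_t load_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

static void store_be32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v >> 24);
    p[1] = (uint8_t)(v >> 16);
    p[2] = (uint8_t)(v >> 8);
    p[3] = (uint8_t)v;
}

/*
 * Add `addend` to a 128-bit big-endian counter. The lowest word takes
 * the full addend; each higher word only receives a carry of 1, and
 * the loop stops at the first word that did not wrap.
 */
static void toy_inc_be128(uint8_t ctr[16], uint32_t addend)
{
    assert(ctr != 0);
    for (int i = 3; i >= 0; i--, addend = 1) {
        uint32_t n = load_be32(ctr + 4 * i) + addend;

        store_be32(ctr + 4 * i, n);
        if (n >= addend)    /* no wrap in this word: done */
            break;
    }
}

/*
 * Number of blocks that can be processed before the low 32-bit
 * counter word wraps, capped at the requested block count.
 */
static uint32_t toy_ctr_blocks(const uint8_t ctr[16], uint32_t blocks)
{
    uint32_t headroom = UINT32_MAX - load_be32(ctr + 12);

    return headroom < blocks ? headroom + 1 : blocks;
}
```

With the low word at 0xfffffffe and 8 blocks requested, only 2 blocks are processed in the first pass; the increment then wraps the low word to 0 and carries 1 into the next word, after which the remaining blocks can be handled normally.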

File diff suppressed because it is too large


@@ -0,0 +1,634 @@
/* sha1-armv7-neon.S - ARM/NEON accelerated SHA-1 transform function
*
* Copyright © 2013-2014 Jussi Kivilinna <jussi.kivilinna@iki.fi>
*
* This program is free software; you can redistribute it and/or modify it
* under the terms of the GNU General Public License as published by the Free
* Software Foundation; either version 2 of the License, or (at your option)
* any later version.
*/
#include <linux/linkage.h>
.syntax unified
.code 32
.fpu neon
.text
/* Context structure */
#define state_h0 0
#define state_h1 4
#define state_h2 8
#define state_h3 12
#define state_h4 16
/* Constants */
#define K1 0x5A827999
#define K2 0x6ED9EBA1
#define K3 0x8F1BBCDC
#define K4 0xCA62C1D6
.align 4
.LK_VEC:
.LK1: .long K1, K1, K1, K1
.LK2: .long K2, K2, K2, K2
.LK3: .long K3, K3, K3, K3
.LK4: .long K4, K4, K4, K4
/* Register macros */
#define RSTATE r0
#define RDATA r1
#define RNBLKS r2
#define ROLDSTACK r3
#define RWK lr
#define _a r4
#define _b r5
#define _c r6
#define _d r7
#define _e r8
#define RT0 r9
#define RT1 r10
#define RT2 r11
#define RT3 r12
#define W0 q0
#define W1 q1
#define W2 q2
#define W3 q3
#define W4 q4
#define W5 q5
#define W6 q6
#define W7 q7
#define tmp0 q8
#define tmp1 q9
#define tmp2 q10
#define tmp3 q11
#define qK1 q12
#define qK2 q13
#define qK3 q14
#define qK4 q15
/* Round function macros. */
#define WK_offs(i) (((i) & 15) * 4)
#define _R_F1(a,b,c,d,e,i,pre1,pre2,pre3,i16,\
W,W_m04,W_m08,W_m12,W_m16,W_m20,W_m24,W_m28) \
ldr RT3, [sp, WK_offs(i)]; \
pre1(i16,W,W_m04,W_m08,W_m12,W_m16,W_m20,W_m24,W_m28); \
bic RT0, d, b; \
add e, e, a, ror #(32 - 5); \
and RT1, c, b; \
pre2(i16,W,W_m04,W_m08,W_m12,W_m16,W_m20,W_m24,W_m28); \
add RT0, RT0, RT3; \
add e, e, RT1; \
ror b, #(32 - 30); \
pre3(i16,W,W_m04,W_m08,W_m12,W_m16,W_m20,W_m24,W_m28); \
add e, e, RT0;
#define _R_F2(a,b,c,d,e,i,pre1,pre2,pre3,i16,\
W,W_m04,W_m08,W_m12,W_m16,W_m20,W_m24,W_m28) \
ldr RT3, [sp, WK_offs(i)]; \
pre1(i16,W,W_m04,W_m08,W_m12,W_m16,W_m20,W_m24,W_m28); \
eor RT0, d, b; \
add e, e, a, ror #(32 - 5); \
eor RT0, RT0, c; \
pre2(i16,W,W_m04,W_m08,W_m12,W_m16,W_m20,W_m24,W_m28); \
add e, e, RT3; \
ror b, #(32 - 30); \
pre3(i16,W,W_m04,W_m08,W_m12,W_m16,W_m20,W_m24,W_m28); \
add e, e, RT0; \
#define _R_F3(a,b,c,d,e,i,pre1,pre2,pre3,i16,\
W,W_m04,W_m08,W_m12,W_m16,W_m20,W_m24,W_m28) \
ldr RT3, [sp, WK_offs(i)]; \
pre1(i16,W,W_m04,W_m08,W_m12,W_m16,W_m20,W_m24,W_m28); \
eor RT0, b, c; \
and RT1, b, c; \
add e, e, a, ror #(32 - 5); \
pre2(i16,W,W_m04,W_m08,W_m12,W_m16,W_m20,W_m24,W_m28); \
and RT0, RT0, d; \
add RT1, RT1, RT3; \
add e, e, RT0; \
ror b, #(32 - 30); \
pre3(i16,W,W_m04,W_m08,W_m12,W_m16,W_m20,W_m24,W_m28); \
add e, e, RT1;
#define _R_F4(a,b,c,d,e,i,pre1,pre2,pre3,i16,\
W,W_m04,W_m08,W_m12,W_m16,W_m20,W_m24,W_m28) \
_R_F2(a,b,c,d,e,i,pre1,pre2,pre3,i16,\
W,W_m04,W_m08,W_m12,W_m16,W_m20,W_m24,W_m28)
#define _R(a,b,c,d,e,f,i,pre1,pre2,pre3,i16,\
W,W_m04,W_m08,W_m12,W_m16,W_m20,W_m24,W_m28) \
_R_##f(a,b,c,d,e,i,pre1,pre2,pre3,i16,\
W,W_m04,W_m08,W_m12,W_m16,W_m20,W_m24,W_m28)
#define R(a,b,c,d,e,f,i) \
_R_##f(a,b,c,d,e,i,dummy,dummy,dummy,i16,\
W,W_m04,W_m08,W_m12,W_m16,W_m20,W_m24,W_m28)
#define dummy(...)
/* Input expansion macros. */
/********* Precalc macros for rounds 0-15 *************************************/
#define W_PRECALC_00_15() \
add RWK, sp, #(WK_offs(0)); \
\
vld1.32 {tmp0, tmp1}, [RDATA]!; \
vrev32.8 W0, tmp0; /* big => little */ \
vld1.32 {tmp2, tmp3}, [RDATA]!; \
vadd.u32 tmp0, W0, curK; \
vrev32.8 W7, tmp1; /* big => little */ \
vrev32.8 W6, tmp2; /* big => little */ \
vadd.u32 tmp1, W7, curK; \
vrev32.8 W5, tmp3; /* big => little */ \
vadd.u32 tmp2, W6, curK; \
vst1.32 {tmp0, tmp1}, [RWK]!; \
vadd.u32 tmp3, W5, curK; \
vst1.32 {tmp2, tmp3}, [RWK]; \
#define WPRECALC_00_15_0(i,W,W_m04,W_m08,W_m12,W_m16,W_m20,W_m24,W_m28) \
vld1.32 {tmp0, tmp1}, [RDATA]!; \
#define WPRECALC_00_15_1(i,W,W_m04,W_m08,W_m12,W_m16,W_m20,W_m24,W_m28) \
add RWK, sp, #(WK_offs(0)); \
#define WPRECALC_00_15_2(i,W,W_m04,W_m08,W_m12,W_m16,W_m20,W_m24,W_m28) \
vrev32.8 W0, tmp0; /* big => little */ \
#define WPRECALC_00_15_3(i,W,W_m04,W_m08,W_m12,W_m16,W_m20,W_m24,W_m28) \
vld1.32 {tmp2, tmp3}, [RDATA]!; \
#define WPRECALC_00_15_4(i,W,W_m04,W_m08,W_m12,W_m16,W_m20,W_m24,W_m28) \
vadd.u32 tmp0, W0, curK; \
#define WPRECALC_00_15_5(i,W,W_m04,W_m08,W_m12,W_m16,W_m20,W_m24,W_m28) \
vrev32.8 W7, tmp1; /* big => little */ \
#define WPRECALC_00_15_6(i,W,W_m04,W_m08,W_m12,W_m16,W_m20,W_m24,W_m28) \
vrev32.8 W6, tmp2; /* big => little */ \
#define WPRECALC_00_15_7(i,W,W_m04,W_m08,W_m12,W_m16,W_m20,W_m24,W_m28) \
vadd.u32 tmp1, W7, curK; \
#define WPRECALC_00_15_8(i,W,W_m04,W_m08,W_m12,W_m16,W_m20,W_m24,W_m28) \
vrev32.8 W5, tmp3; /* big => little */ \
#define WPRECALC_00_15_9(i,W,W_m04,W_m08,W_m12,W_m16,W_m20,W_m24,W_m28) \
vadd.u32 tmp2, W6, curK; \
#define WPRECALC_00_15_10(i,W,W_m04,W_m08,W_m12,W_m16,W_m20,W_m24,W_m28) \
vst1.32 {tmp0, tmp1}, [RWK]!; \
#define WPRECALC_00_15_11(i,W,W_m04,W_m08,W_m12,W_m16,W_m20,W_m24,W_m28) \
vadd.u32 tmp3, W5, curK; \
#define WPRECALC_00_15_12(i,W,W_m04,W_m08,W_m12,W_m16,W_m20,W_m24,W_m28) \
vst1.32 {tmp2, tmp3}, [RWK]; \
/********* Precalc macros for rounds 16-31 ************************************/
#define WPRECALC_16_31_0(i,W,W_m04,W_m08,W_m12,W_m16,W_m20,W_m24,W_m28) \
veor tmp0, tmp0; \
vext.8 W, W_m16, W_m12, #8; \
#define WPRECALC_16_31_1(i,W,W_m04,W_m08,W_m12,W_m16,W_m20,W_m24,W_m28) \
add RWK, sp, #(WK_offs(i)); \
vext.8 tmp0, W_m04, tmp0, #4; \
#define WPRECALC_16_31_2(i,W,W_m04,W_m08,W_m12,W_m16,W_m20,W_m24,W_m28) \
veor tmp0, tmp0, W_m16; \
veor.32 W, W, W_m08; \
#define WPRECALC_16_31_3(i,W,W_m04,W_m08,W_m12,W_m16,W_m20,W_m24,W_m28) \
veor tmp1, tmp1; \
veor W, W, tmp0; \
#define WPRECALC_16_31_4(i,W,W_m04,W_m08,W_m12,W_m16,W_m20,W_m24,W_m28) \
vshl.u32 tmp0, W, #1; \
#define WPRECALC_16_31_5(i,W,W_m04,W_m08,W_m12,W_m16,W_m20,W_m24,W_m28) \
vext.8 tmp1, tmp1, W, #(16-12); \
vshr.u32 W, W, #31; \
#define WPRECALC_16_31_6(i,W,W_m04,W_m08,W_m12,W_m16,W_m20,W_m24,W_m28) \
vorr tmp0, tmp0, W; \
vshr.u32 W, tmp1, #30; \
#define WPRECALC_16_31_7(i,W,W_m04,W_m08,W_m12,W_m16,W_m20,W_m24,W_m28) \
vshl.u32 tmp1, tmp1, #2; \
#define WPRECALC_16_31_8(i,W,W_m04,W_m08,W_m12,W_m16,W_m20,W_m24,W_m28) \
veor tmp0, tmp0, W; \
#define WPRECALC_16_31_9(i,W,W_m04,W_m08,W_m12,W_m16,W_m20,W_m24,W_m28) \
veor W, tmp0, tmp1; \
#define WPRECALC_16_31_10(i,W,W_m04,W_m08,W_m12,W_m16,W_m20,W_m24,W_m28) \
vadd.u32 tmp0, W, curK; \
#define WPRECALC_16_31_11(i,W,W_m04,W_m08,W_m12,W_m16,W_m20,W_m24,W_m28) \
vst1.32 {tmp0}, [RWK];
/********* Precalc macros for rounds 32-79 ************************************/
#define WPRECALC_32_79_0(i,W,W_m04,W_m08,W_m12,W_m16,W_m20,W_m24,W_m28) \
veor W, W_m28; \
#define WPRECALC_32_79_1(i,W,W_m04,W_m08,W_m12,W_m16,W_m20,W_m24,W_m28) \
vext.8 tmp0, W_m08, W_m04, #8; \
#define WPRECALC_32_79_2(i,W,W_m04,W_m08,W_m12,W_m16,W_m20,W_m24,W_m28) \
veor W, W_m16; \
#define WPRECALC_32_79_3(i,W,W_m04,W_m08,W_m12,W_m16,W_m20,W_m24,W_m28) \
veor W, tmp0; \
#define WPRECALC_32_79_4(i,W,W_m04,W_m08,W_m12,W_m16,W_m20,W_m24,W_m28) \
add RWK, sp, #(WK_offs(i&~3)); \
#define WPRECALC_32_79_5(i,W,W_m04,W_m08,W_m12,W_m16,W_m20,W_m24,W_m28) \
vshl.u32 tmp1, W, #2; \
#define WPRECALC_32_79_6(i,W,W_m04,W_m08,W_m12,W_m16,W_m20,W_m24,W_m28) \
vshr.u32 tmp0, W, #30; \
#define WPRECALC_32_79_7(i,W,W_m04,W_m08,W_m12,W_m16,W_m20,W_m24,W_m28) \
vorr W, tmp0, tmp1; \
#define WPRECALC_32_79_8(i,W,W_m04,W_m08,W_m12,W_m16,W_m20,W_m24,W_m28) \
vadd.u32 tmp0, W, curK; \
#define WPRECALC_32_79_9(i,W,W_m04,W_m08,W_m12,W_m16,W_m20,W_m24,W_m28) \
vst1.32 {tmp0}, [RWK];
/*
* Transform nblks*64 bytes (nblks*16 32-bit words) at DATA.
*
* unsigned int
* sha1_transform_neon (void *ctx, const unsigned char *data,
* unsigned int nblks)
*/
.align 3
ENTRY(sha1_transform_neon)
/* input:
* r0: ctx, CTX
* r1: data (64*nblks bytes)
* r2: nblks
*/
cmp RNBLKS, #0;
beq .Ldo_nothing;
push {r4-r12, lr};
/*vpush {q4-q7};*/
adr RT3, .LK_VEC;
mov ROLDSTACK, sp;
/* Align stack. */
sub RT0, sp, #(16*4);
and RT0, #(~(16-1));
mov sp, RT0;
vld1.32 {qK1-qK2}, [RT3]!; /* Load K1,K2 */
/* Get the values of the chaining variables. */
ldm RSTATE, {_a-_e};
vld1.32 {qK3-qK4}, [RT3]; /* Load K3,K4 */
#undef curK
#define curK qK1
/* Precalc 0-15. */
W_PRECALC_00_15();
.Loop:
/* Transform 0-15 + Precalc 16-31. */
_R( _a, _b, _c, _d, _e, F1, 0,
WPRECALC_16_31_0, WPRECALC_16_31_1, WPRECALC_16_31_2, 16,
W4, W5, W6, W7, W0, _, _, _ );
_R( _e, _a, _b, _c, _d, F1, 1,
WPRECALC_16_31_3, WPRECALC_16_31_4, WPRECALC_16_31_5, 16,
W4, W5, W6, W7, W0, _, _, _ );
_R( _d, _e, _a, _b, _c, F1, 2,
WPRECALC_16_31_6, WPRECALC_16_31_7, WPRECALC_16_31_8, 16,
W4, W5, W6, W7, W0, _, _, _ );
_R( _c, _d, _e, _a, _b, F1, 3,
WPRECALC_16_31_9, WPRECALC_16_31_10,WPRECALC_16_31_11,16,
W4, W5, W6, W7, W0, _, _, _ );
#undef curK
#define curK qK2
_R( _b, _c, _d, _e, _a, F1, 4,
WPRECALC_16_31_0, WPRECALC_16_31_1, WPRECALC_16_31_2, 20,
W3, W4, W5, W6, W7, _, _, _ );
_R( _a, _b, _c, _d, _e, F1, 5,
WPRECALC_16_31_3, WPRECALC_16_31_4, WPRECALC_16_31_5, 20,
W3, W4, W5, W6, W7, _, _, _ );
_R( _e, _a, _b, _c, _d, F1, 6,
WPRECALC_16_31_6, WPRECALC_16_31_7, WPRECALC_16_31_8, 20,
W3, W4, W5, W6, W7, _, _, _ );
_R( _d, _e, _a, _b, _c, F1, 7,
WPRECALC_16_31_9, WPRECALC_16_31_10,WPRECALC_16_31_11,20,
W3, W4, W5, W6, W7, _, _, _ );
_R( _c, _d, _e, _a, _b, F1, 8,
WPRECALC_16_31_0, WPRECALC_16_31_1, WPRECALC_16_31_2, 24,
W2, W3, W4, W5, W6, _, _, _ );
_R( _b, _c, _d, _e, _a, F1, 9,
WPRECALC_16_31_3, WPRECALC_16_31_4, WPRECALC_16_31_5, 24,
W2, W3, W4, W5, W6, _, _, _ );
_R( _a, _b, _c, _d, _e, F1, 10,
WPRECALC_16_31_6, WPRECALC_16_31_7, WPRECALC_16_31_8, 24,
W2, W3, W4, W5, W6, _, _, _ );
_R( _e, _a, _b, _c, _d, F1, 11,
WPRECALC_16_31_9, WPRECALC_16_31_10,WPRECALC_16_31_11,24,
W2, W3, W4, W5, W6, _, _, _ );
_R( _d, _e, _a, _b, _c, F1, 12,
WPRECALC_16_31_0, WPRECALC_16_31_1, WPRECALC_16_31_2, 28,
W1, W2, W3, W4, W5, _, _, _ );
_R( _c, _d, _e, _a, _b, F1, 13,
WPRECALC_16_31_3, WPRECALC_16_31_4, WPRECALC_16_31_5, 28,
W1, W2, W3, W4, W5, _, _, _ );
_R( _b, _c, _d, _e, _a, F1, 14,
WPRECALC_16_31_6, WPRECALC_16_31_7, WPRECALC_16_31_8, 28,
W1, W2, W3, W4, W5, _, _, _ );
_R( _a, _b, _c, _d, _e, F1, 15,
WPRECALC_16_31_9, WPRECALC_16_31_10,WPRECALC_16_31_11,28,
W1, W2, W3, W4, W5, _, _, _ );
/* Transform 16-63 + Precalc 32-79. */
_R( _e, _a, _b, _c, _d, F1, 16,
WPRECALC_32_79_0, WPRECALC_32_79_1, WPRECALC_32_79_2, 32,
W0, W1, W2, W3, W4, W5, W6, W7);
_R( _d, _e, _a, _b, _c, F1, 17,
WPRECALC_32_79_3, WPRECALC_32_79_4, WPRECALC_32_79_5, 32,
W0, W1, W2, W3, W4, W5, W6, W7);
_R( _c, _d, _e, _a, _b, F1, 18,
WPRECALC_32_79_6, dummy, WPRECALC_32_79_7, 32,
W0, W1, W2, W3, W4, W5, W6, W7);
_R( _b, _c, _d, _e, _a, F1, 19,
WPRECALC_32_79_8, dummy, WPRECALC_32_79_9, 32,
W0, W1, W2, W3, W4, W5, W6, W7);
_R( _a, _b, _c, _d, _e, F2, 20,
WPRECALC_32_79_0, WPRECALC_32_79_1, WPRECALC_32_79_2, 36,
W7, W0, W1, W2, W3, W4, W5, W6);
_R( _e, _a, _b, _c, _d, F2, 21,
WPRECALC_32_79_3, WPRECALC_32_79_4, WPRECALC_32_79_5, 36,
W7, W0, W1, W2, W3, W4, W5, W6);
_R( _d, _e, _a, _b, _c, F2, 22,
WPRECALC_32_79_6, dummy, WPRECALC_32_79_7, 36,
W7, W0, W1, W2, W3, W4, W5, W6);
_R( _c, _d, _e, _a, _b, F2, 23,
WPRECALC_32_79_8, dummy, WPRECALC_32_79_9, 36,
W7, W0, W1, W2, W3, W4, W5, W6);
#undef curK
#define curK qK3
_R( _b, _c, _d, _e, _a, F2, 24,
WPRECALC_32_79_0, WPRECALC_32_79_1, WPRECALC_32_79_2, 40,
W6, W7, W0, W1, W2, W3, W4, W5);
_R( _a, _b, _c, _d, _e, F2, 25,
WPRECALC_32_79_3, WPRECALC_32_79_4, WPRECALC_32_79_5, 40,
W6, W7, W0, W1, W2, W3, W4, W5);
_R( _e, _a, _b, _c, _d, F2, 26,
WPRECALC_32_79_6, dummy, WPRECALC_32_79_7, 40,
W6, W7, W0, W1, W2, W3, W4, W5);
_R( _d, _e, _a, _b, _c, F2, 27,
WPRECALC_32_79_8, dummy, WPRECALC_32_79_9, 40,
W6, W7, W0, W1, W2, W3, W4, W5);
_R( _c, _d, _e, _a, _b, F2, 28,
WPRECALC_32_79_0, WPRECALC_32_79_1, WPRECALC_32_79_2, 44,
W5, W6, W7, W0, W1, W2, W3, W4);
_R( _b, _c, _d, _e, _a, F2, 29,
WPRECALC_32_79_3, WPRECALC_32_79_4, WPRECALC_32_79_5, 44,
W5, W6, W7, W0, W1, W2, W3, W4);
_R( _a, _b, _c, _d, _e, F2, 30,
WPRECALC_32_79_6, dummy, WPRECALC_32_79_7, 44,
W5, W6, W7, W0, W1, W2, W3, W4);
_R( _e, _a, _b, _c, _d, F2, 31,
WPRECALC_32_79_8, dummy, WPRECALC_32_79_9, 44,
W5, W6, W7, W0, W1, W2, W3, W4);
_R( _d, _e, _a, _b, _c, F2, 32,
WPRECALC_32_79_0, WPRECALC_32_79_1, WPRECALC_32_79_2, 48,
W4, W5, W6, W7, W0, W1, W2, W3);
_R( _c, _d, _e, _a, _b, F2, 33,
WPRECALC_32_79_3, WPRECALC_32_79_4, WPRECALC_32_79_5, 48,
W4, W5, W6, W7, W0, W1, W2, W3);
_R( _b, _c, _d, _e, _a, F2, 34,
WPRECALC_32_79_6, dummy, WPRECALC_32_79_7, 48,
W4, W5, W6, W7, W0, W1, W2, W3);
_R( _a, _b, _c, _d, _e, F2, 35,
WPRECALC_32_79_8, dummy, WPRECALC_32_79_9, 48,
W4, W5, W6, W7, W0, W1, W2, W3);
_R( _e, _a, _b, _c, _d, F2, 36,
WPRECALC_32_79_0, WPRECALC_32_79_1, WPRECALC_32_79_2, 52,
W3, W4, W5, W6, W7, W0, W1, W2);
_R( _d, _e, _a, _b, _c, F2, 37,
WPRECALC_32_79_3, WPRECALC_32_79_4, WPRECALC_32_79_5, 52,
W3, W4, W5, W6, W7, W0, W1, W2);
_R( _c, _d, _e, _a, _b, F2, 38,
WPRECALC_32_79_6, dummy, WPRECALC_32_79_7, 52,
W3, W4, W5, W6, W7, W0, W1, W2);
_R( _b, _c, _d, _e, _a, F2, 39,
WPRECALC_32_79_8, dummy, WPRECALC_32_79_9, 52,
W3, W4, W5, W6, W7, W0, W1, W2);
_R( _a, _b, _c, _d, _e, F3, 40,
WPRECALC_32_79_0, WPRECALC_32_79_1, WPRECALC_32_79_2, 56,
W2, W3, W4, W5, W6, W7, W0, W1);
_R( _e, _a, _b, _c, _d, F3, 41,
WPRECALC_32_79_3, WPRECALC_32_79_4, WPRECALC_32_79_5, 56,
W2, W3, W4, W5, W6, W7, W0, W1);
_R( _d, _e, _a, _b, _c, F3, 42,
WPRECALC_32_79_6, dummy, WPRECALC_32_79_7, 56,
W2, W3, W4, W5, W6, W7, W0, W1);
_R( _c, _d, _e, _a, _b, F3, 43,
WPRECALC_32_79_8, dummy, WPRECALC_32_79_9, 56,
W2, W3, W4, W5, W6, W7, W0, W1);
#undef curK
#define curK qK4
_R( _b, _c, _d, _e, _a, F3, 44,
WPRECALC_32_79_0, WPRECALC_32_79_1, WPRECALC_32_79_2, 60,
W1, W2, W3, W4, W5, W6, W7, W0);
_R( _a, _b, _c, _d, _e, F3, 45,
WPRECALC_32_79_3, WPRECALC_32_79_4, WPRECALC_32_79_5, 60,
W1, W2, W3, W4, W5, W6, W7, W0);
_R( _e, _a, _b, _c, _d, F3, 46,
WPRECALC_32_79_6, dummy, WPRECALC_32_79_7, 60,
W1, W2, W3, W4, W5, W6, W7, W0);
_R( _d, _e, _a, _b, _c, F3, 47,
WPRECALC_32_79_8, dummy, WPRECALC_32_79_9, 60,
W1, W2, W3, W4, W5, W6, W7, W0);
_R( _c, _d, _e, _a, _b, F3, 48,
WPRECALC_32_79_0, WPRECALC_32_79_1, WPRECALC_32_79_2, 64,
W0, W1, W2, W3, W4, W5, W6, W7);
_R( _b, _c, _d, _e, _a, F3, 49,
WPRECALC_32_79_3, WPRECALC_32_79_4, WPRECALC_32_79_5, 64,
W0, W1, W2, W3, W4, W5, W6, W7);
_R( _a, _b, _c, _d, _e, F3, 50,
WPRECALC_32_79_6, dummy, WPRECALC_32_79_7, 64,
W0, W1, W2, W3, W4, W5, W6, W7);
_R( _e, _a, _b, _c, _d, F3, 51,
WPRECALC_32_79_8, dummy, WPRECALC_32_79_9, 64,
W0, W1, W2, W3, W4, W5, W6, W7);
_R( _d, _e, _a, _b, _c, F3, 52,
WPRECALC_32_79_0, WPRECALC_32_79_1, WPRECALC_32_79_2, 68,
W7, W0, W1, W2, W3, W4, W5, W6);
_R( _c, _d, _e, _a, _b, F3, 53,
WPRECALC_32_79_3, WPRECALC_32_79_4, WPRECALC_32_79_5, 68,
W7, W0, W1, W2, W3, W4, W5, W6);
_R( _b, _c, _d, _e, _a, F3, 54,
WPRECALC_32_79_6, dummy, WPRECALC_32_79_7, 68,
W7, W0, W1, W2, W3, W4, W5, W6);
_R( _a, _b, _c, _d, _e, F3, 55,
WPRECALC_32_79_8, dummy, WPRECALC_32_79_9, 68,
W7, W0, W1, W2, W3, W4, W5, W6);
_R( _e, _a, _b, _c, _d, F3, 56,
WPRECALC_32_79_0, WPRECALC_32_79_1, WPRECALC_32_79_2, 72,
W6, W7, W0, W1, W2, W3, W4, W5);
_R( _d, _e, _a, _b, _c, F3, 57,
WPRECALC_32_79_3, WPRECALC_32_79_4, WPRECALC_32_79_5, 72,
W6, W7, W0, W1, W2, W3, W4, W5);
_R( _c, _d, _e, _a, _b, F3, 58,
WPRECALC_32_79_6, dummy, WPRECALC_32_79_7, 72,
W6, W7, W0, W1, W2, W3, W4, W5);
_R( _b, _c, _d, _e, _a, F3, 59,
WPRECALC_32_79_8, dummy, WPRECALC_32_79_9, 72,
W6, W7, W0, W1, W2, W3, W4, W5);
subs RNBLKS, #1;
_R( _a, _b, _c, _d, _e, F4, 60,
WPRECALC_32_79_0, WPRECALC_32_79_1, WPRECALC_32_79_2, 76,
W5, W6, W7, W0, W1, W2, W3, W4);
_R( _e, _a, _b, _c, _d, F4, 61,
WPRECALC_32_79_3, WPRECALC_32_79_4, WPRECALC_32_79_5, 76,
W5, W6, W7, W0, W1, W2, W3, W4);
_R( _d, _e, _a, _b, _c, F4, 62,
WPRECALC_32_79_6, dummy, WPRECALC_32_79_7, 76,
W5, W6, W7, W0, W1, W2, W3, W4);
_R( _c, _d, _e, _a, _b, F4, 63,
WPRECALC_32_79_8, dummy, WPRECALC_32_79_9, 76,
W5, W6, W7, W0, W1, W2, W3, W4);
beq .Lend;
/* Transform 64-79 + Precalc 0-15 of next block. */
#undef curK
#define curK qK1
_R( _b, _c, _d, _e, _a, F4, 64,
WPRECALC_00_15_0, dummy, dummy, _, _, _, _, _, _, _, _, _ );
_R( _a, _b, _c, _d, _e, F4, 65,
WPRECALC_00_15_1, dummy, dummy, _, _, _, _, _, _, _, _, _ );
_R( _e, _a, _b, _c, _d, F4, 66,
WPRECALC_00_15_2, dummy, dummy, _, _, _, _, _, _, _, _, _ );
_R( _d, _e, _a, _b, _c, F4, 67,
WPRECALC_00_15_3, dummy, dummy, _, _, _, _, _, _, _, _, _ );
_R( _c, _d, _e, _a, _b, F4, 68,
dummy, dummy, dummy, _, _, _, _, _, _, _, _, _ );
_R( _b, _c, _d, _e, _a, F4, 69,
dummy, dummy, dummy, _, _, _, _, _, _, _, _, _ );
_R( _a, _b, _c, _d, _e, F4, 70,
WPRECALC_00_15_4, dummy, dummy, _, _, _, _, _, _, _, _, _ );
_R( _e, _a, _b, _c, _d, F4, 71,
WPRECALC_00_15_5, dummy, dummy, _, _, _, _, _, _, _, _, _ );
_R( _d, _e, _a, _b, _c, F4, 72,
dummy, dummy, dummy, _, _, _, _, _, _, _, _, _ );
_R( _c, _d, _e, _a, _b, F4, 73,
dummy, dummy, dummy, _, _, _, _, _, _, _, _, _ );
_R( _b, _c, _d, _e, _a, F4, 74,
WPRECALC_00_15_6, dummy, dummy, _, _, _, _, _, _, _, _, _ );
_R( _a, _b, _c, _d, _e, F4, 75,
WPRECALC_00_15_7, dummy, dummy, _, _, _, _, _, _, _, _, _ );
_R( _e, _a, _b, _c, _d, F4, 76,
WPRECALC_00_15_8, dummy, dummy, _, _, _, _, _, _, _, _, _ );
_R( _d, _e, _a, _b, _c, F4, 77,
WPRECALC_00_15_9, dummy, dummy, _, _, _, _, _, _, _, _, _ );
_R( _c, _d, _e, _a, _b, F4, 78,
WPRECALC_00_15_10, dummy, dummy, _, _, _, _, _, _, _, _, _ );
_R( _b, _c, _d, _e, _a, F4, 79,
WPRECALC_00_15_11, dummy, WPRECALC_00_15_12, _, _, _, _, _, _, _, _, _ );
/* Update the chaining variables. */
ldm RSTATE, {RT0-RT3};
add _a, RT0;
ldr RT0, [RSTATE, #state_h4];
add _b, RT1;
add _c, RT2;
add _d, RT3;
add _e, RT0;
stm RSTATE, {_a-_e};
b .Loop;
.Lend:
/* Transform 64-79 */
R( _b, _c, _d, _e, _a, F4, 64 );
R( _a, _b, _c, _d, _e, F4, 65 );
R( _e, _a, _b, _c, _d, F4, 66 );
R( _d, _e, _a, _b, _c, F4, 67 );
R( _c, _d, _e, _a, _b, F4, 68 );
R( _b, _c, _d, _e, _a, F4, 69 );
R( _a, _b, _c, _d, _e, F4, 70 );
R( _e, _a, _b, _c, _d, F4, 71 );
R( _d, _e, _a, _b, _c, F4, 72 );
R( _c, _d, _e, _a, _b, F4, 73 );
R( _b, _c, _d, _e, _a, F4, 74 );
R( _a, _b, _c, _d, _e, F4, 75 );
R( _e, _a, _b, _c, _d, F4, 76 );
R( _d, _e, _a, _b, _c, F4, 77 );
R( _c, _d, _e, _a, _b, F4, 78 );
R( _b, _c, _d, _e, _a, F4, 79 );
mov sp, ROLDSTACK;
/* Update the chaining variables. */
ldm RSTATE, {RT0-RT3};
add _a, RT0;
ldr RT0, [RSTATE, #state_h4];
add _b, RT1;
add _c, RT2;
add _d, RT3;
/*vpop {q4-q7};*/
add _e, RT0;
stm RSTATE, {_a-_e};
pop {r4-r12, pc};
.Ldo_nothing:
bx lr
ENDPROC(sha1_transform_neon)
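
Note on the precalc macros above: the WPRECALC_32_79 steps XOR together W, W_m28, W_m16 and a vector built from W_m08/W_m04, then rotate by 2 rather than by 1. This appears to rely on the usual rewrite of the SHA-1 message schedule that reaches back 32 words, so four words can be computed at once without depending on each other. The plain C sketch below compares the two forms; the function names are hypothetical and it is a scalar illustration, not a translation of the NEON code.

```c
#include <stdint.h>

/* Rotate a 32-bit word left by n bits (0 < n < 32). */
static inline uint32_t rol32(uint32_t x, unsigned int n)
{
	return (x << n) | (x >> (32 - n));
}

/* Textbook SHA-1 schedule: every word depends on w[i-3]. */
void sha1_schedule_ref(uint32_t w[80])
{
	for (int i = 16; i < 80; i++)
		w[i] = rol32(w[i - 3] ^ w[i - 8] ^ w[i - 14] ^ w[i - 16], 1);
}

/*
 * Same schedule, but for i >= 32 the recurrence is expanded once so the
 * nearest input is w[i-6]. Groups of four consecutive words then have no
 * dependency on each other, which is what makes a 4-lane vector form work.
 */
void sha1_schedule_wide(uint32_t w[80])
{
	for (int i = 16; i < 32; i++)
		w[i] = rol32(w[i - 3] ^ w[i - 8] ^ w[i - 14] ^ w[i - 16], 1);
	for (int i = 32; i < 80; i++)
		w[i] = rol32(w[i - 6] ^ w[i - 16] ^ w[i - 28] ^ w[i - 32], 2);
}
```

Both functions expect w[0..15] to hold the big-endian-decoded input block and fill in w[16..79].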


@@ -23,32 +23,27 @@
#include <linux/types.h>
#include <crypto/sha.h>
#include <asm/byteorder.h>
#include <asm/crypto/sha1.h>
struct SHA1_CTX {
uint32_t h0,h1,h2,h3,h4;
u64 count;
u8 data[SHA1_BLOCK_SIZE];
};
asmlinkage void sha1_block_data_order(struct SHA1_CTX *digest,
asmlinkage void sha1_block_data_order(u32 *digest,
const unsigned char *data, unsigned int rounds);
static int sha1_init(struct shash_desc *desc)
{
struct SHA1_CTX *sctx = shash_desc_ctx(desc);
memset(sctx, 0, sizeof(*sctx));
sctx->h0 = SHA1_H0;
sctx->h1 = SHA1_H1;
sctx->h2 = SHA1_H2;
sctx->h3 = SHA1_H3;
sctx->h4 = SHA1_H4;
struct sha1_state *sctx = shash_desc_ctx(desc);
*sctx = (struct sha1_state){
.state = { SHA1_H0, SHA1_H1, SHA1_H2, SHA1_H3, SHA1_H4 },
};
return 0;
}
static int __sha1_update(struct SHA1_CTX *sctx, const u8 *data,
unsigned int len, unsigned int partial)
static int __sha1_update(struct sha1_state *sctx, const u8 *data,
unsigned int len, unsigned int partial)
{
unsigned int done = 0;
@@ -56,43 +51,44 @@ static int __sha1_update(struct SHA1_CTX *sctx, const u8 *data,
if (partial) {
done = SHA1_BLOCK_SIZE - partial;
memcpy(sctx->data + partial, data, done);
sha1_block_data_order(sctx, sctx->data, 1);
memcpy(sctx->buffer + partial, data, done);
sha1_block_data_order(sctx->state, sctx->buffer, 1);
}
if (len - done >= SHA1_BLOCK_SIZE) {
const unsigned int rounds = (len - done) / SHA1_BLOCK_SIZE;
sha1_block_data_order(sctx, data + done, rounds);
sha1_block_data_order(sctx->state, data + done, rounds);
done += rounds * SHA1_BLOCK_SIZE;
}
memcpy(sctx->data, data + done, len - done);
memcpy(sctx->buffer, data + done, len - done);
return 0;
}
static int sha1_update(struct shash_desc *desc, const u8 *data,
unsigned int len)
int sha1_update_arm(struct shash_desc *desc, const u8 *data,
unsigned int len)
{
struct SHA1_CTX *sctx = shash_desc_ctx(desc);
struct sha1_state *sctx = shash_desc_ctx(desc);
unsigned int partial = sctx->count % SHA1_BLOCK_SIZE;
int res;
/* Handle the fast case right here */
if (partial + len < SHA1_BLOCK_SIZE) {
sctx->count += len;
memcpy(sctx->data + partial, data, len);
memcpy(sctx->buffer + partial, data, len);
return 0;
}
res = __sha1_update(sctx, data, len, partial);
return res;
}
EXPORT_SYMBOL_GPL(sha1_update_arm);
/* Add padding and return the message digest. */
static int sha1_final(struct shash_desc *desc, u8 *out)
{
struct SHA1_CTX *sctx = shash_desc_ctx(desc);
struct sha1_state *sctx = shash_desc_ctx(desc);
unsigned int i, index, padlen;
__be32 *dst = (__be32 *)out;
__be64 bits;
@@ -106,7 +102,7 @@ static int sha1_final(struct shash_desc *desc, u8 *out)
/* We need to fill a whole block for __sha1_update() */
if (padlen <= 56) {
sctx->count += padlen;
memcpy(sctx->data + index, padding, padlen);
memcpy(sctx->buffer + index, padding, padlen);
} else {
__sha1_update(sctx, padding, padlen, index);
}
@@ -114,7 +110,7 @@ static int sha1_final(struct shash_desc *desc, u8 *out)
/* Store state in digest */
for (i = 0; i < 5; i++)
dst[i] = cpu_to_be32(((u32 *)sctx)[i]);
dst[i] = cpu_to_be32(sctx->state[i]);
/* Wipe context */
memset(sctx, 0, sizeof(*sctx));
@@ -124,7 +120,7 @@ static int sha1_final(struct shash_desc *desc, u8 *out)
static int sha1_export(struct shash_desc *desc, void *out)
{
struct SHA1_CTX *sctx = shash_desc_ctx(desc);
struct sha1_state *sctx = shash_desc_ctx(desc);
memcpy(out, sctx, sizeof(*sctx));
return 0;
}
@@ -132,7 +128,7 @@ static int sha1_export(struct shash_desc *desc, void *out)
static int sha1_import(struct shash_desc *desc, const void *in)
{
struct SHA1_CTX *sctx = shash_desc_ctx(desc);
struct sha1_state *sctx = shash_desc_ctx(desc);
memcpy(sctx, in, sizeof(*sctx));
return 0;
}
@@ -141,12 +137,12 @@ static int sha1_import(struct shash_desc *desc, const void *in)
static struct shash_alg alg = {
.digestsize = SHA1_DIGEST_SIZE,
.init = sha1_init,
.update = sha1_update,
.update = sha1_update_arm,
.final = sha1_final,
.export = sha1_export,
.import = sha1_import,
.descsize = sizeof(struct SHA1_CTX),
.statesize = sizeof(struct SHA1_CTX),
.descsize = sizeof(struct sha1_state),
.statesize = sizeof(struct sha1_state),
.base = {
.cra_name = "sha1",
.cra_driver_name= "sha1-asm",
@@ -175,5 +171,5 @@ module_exit(sha1_mod_fini);
MODULE_LICENSE("GPL");
MODULE_DESCRIPTION("SHA1 Secure Hash Algorithm (ARM)");
MODULE_ALIAS("sha1");
MODULE_ALIAS_CRYPTO("sha1");
MODULE_AUTHOR("David McCullough <ucdevel@gmail.com>");
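
Note on the buffering in sha1_update_arm() and __sha1_update() above: input is first used to top up a partially filled block, then whole blocks are passed to the transform straight from the caller's buffer, and the tail is kept for the next call. The userspace sketch below follows the same control flow with a stand-in transform that only counts blocks. All toy_* names are hypothetical, and the two kernel functions are folded into one for brevity.

```c
#include <stdint.h>
#include <string.h>

#define TOY_BLOCK_SIZE 64

struct toy_state {
	uint64_t count;                  /* total bytes seen so far */
	uint8_t buffer[TOY_BLOCK_SIZE];  /* bytes not yet transformed */
	unsigned int blocks_seen;        /* stand-in for the hash state */
};

/* Stand-in for sha1_block_data_order(): only counts the blocks. */
static void toy_transform(struct toy_state *s, const uint8_t *data,
			  unsigned int nblks)
{
	(void)data;
	s->blocks_seen += nblks;
}

void toy_update(struct toy_state *s, const uint8_t *data, unsigned int len)
{
	unsigned int partial = (unsigned int)(s->count % TOY_BLOCK_SIZE);
	unsigned int done = 0;

	s->count += len;

	/* Fast case: the new data does not complete a block. */
	if (partial + len < TOY_BLOCK_SIZE) {
		memcpy(s->buffer + partial, data, len);
		return;
	}

	/* Top up the buffered block and transform it. */
	if (partial) {
		done = TOY_BLOCK_SIZE - partial;
		memcpy(s->buffer + partial, data, done);
		toy_transform(s, s->buffer, 1);
	}

	/* Transform as many whole blocks as possible in place. */
	if (len - done >= TOY_BLOCK_SIZE) {
		unsigned int nblks = (len - done) / TOY_BLOCK_SIZE;

		toy_transform(s, data + done, nblks);
		done += nblks * TOY_BLOCK_SIZE;
	}

	/* Keep the tail (fewer than 64 bytes) for the next call. */
	memcpy(s->buffer, data + done, len - done);
}
```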


@@ -0,0 +1,197 @@
/*
* Glue code for the SHA1 Secure Hash Algorithm assembler implementation using
* ARM NEON instructions.
*
* Copyright © 2014 Jussi Kivilinna <jussi.kivilinna@iki.fi>
*
* This file is based on sha1_generic.c and sha1_ssse3_glue.c:
* Copyright (c) Alan Smithee.
* Copyright (c) Andrew McDonald <andrew@mcdonald.org.uk>
* Copyright (c) Jean-Francois Dive <jef@linuxbe.org>
* Copyright (c) Mathias Krause <minipli@googlemail.com>
* Copyright (c) Chandramouli Narayanan <mouli@linux.intel.com>
*
* This program is free software; you can redistribute it and/or modify it
* under the terms of the GNU General Public License as published by the Free
* Software Foundation; either version 2 of the License, or (at your option)
* any later version.
*
*/
#include <crypto/internal/hash.h>
#include <linux/init.h>
#include <linux/module.h>
#include <linux/mm.h>
#include <linux/cryptohash.h>
#include <linux/types.h>
#include <crypto/sha.h>
#include <asm/byteorder.h>
#include <asm/neon.h>
#include <asm/simd.h>
#include <asm/crypto/sha1.h>
asmlinkage void sha1_transform_neon(void *state_h, const char *data,
unsigned int rounds);
static int sha1_neon_init(struct shash_desc *desc)
{
struct sha1_state *sctx = shash_desc_ctx(desc);
*sctx = (struct sha1_state){
.state = { SHA1_H0, SHA1_H1, SHA1_H2, SHA1_H3, SHA1_H4 },
};
return 0;
}
static int __sha1_neon_update(struct shash_desc *desc, const u8 *data,
unsigned int len, unsigned int partial)
{
struct sha1_state *sctx = shash_desc_ctx(desc);
unsigned int done = 0;
sctx->count += len;
if (partial) {
done = SHA1_BLOCK_SIZE - partial;
memcpy(sctx->buffer + partial, data, done);
sha1_transform_neon(sctx->state, sctx->buffer, 1);
}
if (len - done >= SHA1_BLOCK_SIZE) {
const unsigned int rounds = (len - done) / SHA1_BLOCK_SIZE;
sha1_transform_neon(sctx->state, data + done, rounds);
done += rounds * SHA1_BLOCK_SIZE;
}
memcpy(sctx->buffer, data + done, len - done);
return 0;
}
static int sha1_neon_update(struct shash_desc *desc, const u8 *data,
unsigned int len)
{
struct sha1_state *sctx = shash_desc_ctx(desc);
unsigned int partial = sctx->count % SHA1_BLOCK_SIZE;
int res;
/* Handle the fast case right here */
if (partial + len < SHA1_BLOCK_SIZE) {
sctx->count += len;
memcpy(sctx->buffer + partial, data, len);
return 0;
}
if (!may_use_simd()) {
res = sha1_update_arm(desc, data, len);
} else {
kernel_neon_begin();
res = __sha1_neon_update(desc, data, len, partial);
kernel_neon_end();
}
return res;
}
/* Add padding and return the message digest. */
static int sha1_neon_final(struct shash_desc *desc, u8 *out)
{
struct sha1_state *sctx = shash_desc_ctx(desc);
unsigned int i, index, padlen;
__be32 *dst = (__be32 *)out;
__be64 bits;
static const u8 padding[SHA1_BLOCK_SIZE] = { 0x80, };
bits = cpu_to_be64(sctx->count << 3);
/* Pad out to 56 mod 64 and append length */
index = sctx->count % SHA1_BLOCK_SIZE;
padlen = (index < 56) ? (56 - index) : ((SHA1_BLOCK_SIZE+56) - index);
if (!may_use_simd()) {
sha1_update_arm(desc, padding, padlen);
sha1_update_arm(desc, (const u8 *)&bits, sizeof(bits));
} else {
kernel_neon_begin();
/* We need to fill a whole block for __sha1_neon_update() */
if (padlen <= 56) {
sctx->count += padlen;
memcpy(sctx->buffer + index, padding, padlen);
} else {
__sha1_neon_update(desc, padding, padlen, index);
}
__sha1_neon_update(desc, (const u8 *)&bits, sizeof(bits), 56);
kernel_neon_end();
}
/* Store state in digest */
for (i = 0; i < 5; i++)
dst[i] = cpu_to_be32(sctx->state[i]);
/* Wipe context */
memset(sctx, 0, sizeof(*sctx));
return 0;
}
static int sha1_neon_export(struct shash_desc *desc, void *out)
{
struct sha1_state *sctx = shash_desc_ctx(desc);
memcpy(out, sctx, sizeof(*sctx));
return 0;
}
static int sha1_neon_import(struct shash_desc *desc, const void *in)
{
struct sha1_state *sctx = shash_desc_ctx(desc);
memcpy(sctx, in, sizeof(*sctx));
return 0;
}
static struct shash_alg alg = {
.digestsize = SHA1_DIGEST_SIZE,
.init = sha1_neon_init,
.update = sha1_neon_update,
.final = sha1_neon_final,
.export = sha1_neon_export,
.import = sha1_neon_import,
.descsize = sizeof(struct sha1_state),
.statesize = sizeof(struct sha1_state),
.base = {
.cra_name = "sha1",
.cra_driver_name = "sha1-neon",
.cra_priority = 250,
.cra_flags = CRYPTO_ALG_TYPE_SHASH,
.cra_blocksize = SHA1_BLOCK_SIZE,
.cra_module = THIS_MODULE,
}
};
static int __init sha1_neon_mod_init(void)
{
if (!cpu_has_neon())
return -ENODEV;
return crypto_register_shash(&alg);
}
static void __exit sha1_neon_mod_fini(void)
{
crypto_unregister_shash(&alg);
}
module_init(sha1_neon_mod_init);
module_exit(sha1_neon_mod_fini);
MODULE_LICENSE("GPL");
MODULE_DESCRIPTION("SHA1 Secure Hash Algorithm, NEON accelerated");
MODULE_ALIAS("sha1");
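
Note on the "Pad out to 56 mod 64 and append length" step in sha1_neon_final() above: the pad length is chosen so that, once the 8-byte bit count is appended, the message ends exactly on a block boundary. The small sketch below isolates that arithmetic; sha1_padlen() is a hypothetical helper name, and SHA1_BLOCK_SIZE is redefined locally so the snippet builds outside the kernel.

```c
#include <stdint.h>

#define SHA1_BLOCK_SIZE 64

/*
 * Number of padding bytes (0x80 followed by zeros) needed after `count`
 * message bytes so that count + padlen is congruent to 56 modulo 64.
 * The result is always in the range 1..64.
 */
unsigned int sha1_padlen(uint64_t count)
{
	unsigned int index = (unsigned int)(count % SHA1_BLOCK_SIZE);

	return (index < 56) ? (56 - index) : ((SHA1_BLOCK_SIZE + 56) - index);
}
```

When index is 56 or more there is no room left for the length field in the current block, so the padding spills into one extra block.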


@@ -0,0 +1,455 @@
/* sha512-armv7-neon.S - ARM/NEON assembly implementation of SHA-512 transform
*
* Copyright © 2013-2014 Jussi Kivilinna <jussi.kivilinna@iki.fi>
*
* This program is free software; you can redistribute it and/or modify it
* under the terms of the GNU General Public License as published by the Free
* Software Foundation; either version 2 of the License, or (at your option)
* any later version.
*/
#include <linux/linkage.h>
.syntax unified
.code 32
.fpu neon
.text
/* structure of SHA512_CONTEXT */
#define hd_a 0
#define hd_b ((hd_a) + 8)
#define hd_c ((hd_b) + 8)
#define hd_d ((hd_c) + 8)
#define hd_e ((hd_d) + 8)
#define hd_f ((hd_e) + 8)
#define hd_g ((hd_f) + 8)
/* register macros */
#define RK %r2
#define RA d0
#define RB d1
#define RC d2
#define RD d3
#define RE d4
#define RF d5
#define RG d6
#define RH d7
#define RT0 d8
#define RT1 d9
#define RT2 d10
#define RT3 d11
#define RT4 d12
#define RT5 d13
#define RT6 d14
#define RT7 d15
#define RT01q q4
#define RT23q q5
#define RT45q q6
#define RT67q q7
#define RW0 d16
#define RW1 d17
#define RW2 d18
#define RW3 d19
#define RW4 d20
#define RW5 d21
#define RW6 d22
#define RW7 d23
#define RW8 d24
#define RW9 d25
#define RW10 d26
#define RW11 d27
#define RW12 d28
#define RW13 d29
#define RW14 d30
#define RW15 d31
#define RW01q q8
#define RW23q q9
#define RW45q q10
#define RW67q q11
#define RW89q q12
#define RW1011q q13
#define RW1213q q14
#define RW1415q q15
/***********************************************************************
* ARM assembly implementation of sha512 transform
***********************************************************************/
#define rounds2_0_63(ra, rb, rc, rd, re, rf, rg, rh, rw0, rw1, rw01q, rw2, \
rw23q, rw1415q, rw9, rw10, interleave_op, arg1) \
/* t1 = h + Sum1 (e) + Ch (e, f, g) + k[t] + w[t]; */ \
vshr.u64 RT2, re, #14; \
vshl.u64 RT3, re, #64 - 14; \
interleave_op(arg1); \
vshr.u64 RT4, re, #18; \
vshl.u64 RT5, re, #64 - 18; \
vld1.64 {RT0}, [RK]!; \
veor.64 RT23q, RT23q, RT45q; \
vshr.u64 RT4, re, #41; \
vshl.u64 RT5, re, #64 - 41; \
vadd.u64 RT0, RT0, rw0; \
veor.64 RT23q, RT23q, RT45q; \
vmov.64 RT7, re; \
veor.64 RT1, RT2, RT3; \
vbsl.64 RT7, rf, rg; \
\
vadd.u64 RT1, RT1, rh; \
vshr.u64 RT2, ra, #28; \
vshl.u64 RT3, ra, #64 - 28; \
vadd.u64 RT1, RT1, RT0; \
vshr.u64 RT4, ra, #34; \
vshl.u64 RT5, ra, #64 - 34; \
vadd.u64 RT1, RT1, RT7; \
\
/* h = Sum0 (a) + Maj (a, b, c); */ \
veor.64 RT23q, RT23q, RT45q; \
vshr.u64 RT4, ra, #39; \
vshl.u64 RT5, ra, #64 - 39; \
veor.64 RT0, ra, rb; \
veor.64 RT23q, RT23q, RT45q; \
vbsl.64 RT0, rc, rb; \
vadd.u64 rd, rd, RT1; /* d+=t1; */ \
veor.64 rh, RT2, RT3; \
\
/* t1 = g + Sum1 (d) + Ch (d, e, f) + k[t] + w[t]; */ \
vshr.u64 RT2, rd, #14; \
vshl.u64 RT3, rd, #64 - 14; \
vadd.u64 rh, rh, RT0; \
vshr.u64 RT4, rd, #18; \
vshl.u64 RT5, rd, #64 - 18; \
vadd.u64 rh, rh, RT1; /* h+=t1; */ \
vld1.64 {RT0}, [RK]!; \
veor.64 RT23q, RT23q, RT45q; \
vshr.u64 RT4, rd, #41; \
vshl.u64 RT5, rd, #64 - 41; \
vadd.u64 RT0, RT0, rw1; \
veor.64 RT23q, RT23q, RT45q; \
vmov.64 RT7, rd; \
veor.64 RT1, RT2, RT3; \
vbsl.64 RT7, re, rf; \
\
vadd.u64 RT1, RT1, rg; \
vshr.u64 RT2, rh, #28; \
vshl.u64 RT3, rh, #64 - 28; \
vadd.u64 RT1, RT1, RT0; \
vshr.u64 RT4, rh, #34; \
vshl.u64 RT5, rh, #64 - 34; \
vadd.u64 RT1, RT1, RT7; \
\
/* g = Sum0 (h) + Maj (h, a, b); */ \
veor.64 RT23q, RT23q, RT45q; \
vshr.u64 RT4, rh, #39; \
vshl.u64 RT5, rh, #64 - 39; \
veor.64 RT0, rh, ra; \
veor.64 RT23q, RT23q, RT45q; \
vbsl.64 RT0, rb, ra; \
vadd.u64 rc, rc, RT1; /* c+=t1; */ \
veor.64 rg, RT2, RT3; \
\
/* w[0] += S1 (w[14]) + w[9] + S0 (w[1]); */ \
/* w[1] += S1 (w[15]) + w[10] + S0 (w[2]); */ \
\
/**** S0(w[1:2]) */ \
\
/* w[0:1] += w[9:10] */ \
/* RT23q = rw1:rw2 */ \
vext.u64 RT23q, rw01q, rw23q, #1; \
vadd.u64 rw0, rw9; \
vadd.u64 rg, rg, RT0; \
vadd.u64 rw1, rw10;\
vadd.u64 rg, rg, RT1; /* g+=t1; */ \
\
vshr.u64 RT45q, RT23q, #1; \
vshl.u64 RT67q, RT23q, #64 - 1; \
vshr.u64 RT01q, RT23q, #8; \
veor.u64 RT45q, RT45q, RT67q; \
vshl.u64 RT67q, RT23q, #64 - 8; \
veor.u64 RT45q, RT45q, RT01q; \
vshr.u64 RT01q, RT23q, #7; \
veor.u64 RT45q, RT45q, RT67q; \
\
/**** S1(w[14:15]) */ \
vshr.u64 RT23q, rw1415q, #6; \
veor.u64 RT01q, RT01q, RT45q; \
vshr.u64 RT45q, rw1415q, #19; \
vshl.u64 RT67q, rw1415q, #64 - 19; \
veor.u64 RT23q, RT23q, RT45q; \
vshr.u64 RT45q, rw1415q, #61; \
veor.u64 RT23q, RT23q, RT67q; \
vshl.u64 RT67q, rw1415q, #64 - 61; \
veor.u64 RT23q, RT23q, RT45q; \
vadd.u64 rw01q, RT01q; /* w[0:1] += S(w[1:2]) */ \
veor.u64 RT01q, RT23q, RT67q;
#define vadd_RT01q(rw01q) \
/* w[0:1] += S(w[14:15]) */ \
vadd.u64 rw01q, RT01q;
#define dummy(_) /*_*/
#define rounds2_64_79(ra, rb, rc, rd, re, rf, rg, rh, rw0, rw1, \
interleave_op1, arg1, interleave_op2, arg2) \
/* t1 = h + Sum1 (e) + Ch (e, f, g) + k[t] + w[t]; */ \
vshr.u64 RT2, re, #14; \
vshl.u64 RT3, re, #64 - 14; \
interleave_op1(arg1); \
vshr.u64 RT4, re, #18; \
vshl.u64 RT5, re, #64 - 18; \
interleave_op2(arg2); \
vld1.64 {RT0}, [RK]!; \
veor.64 RT23q, RT23q, RT45q; \
vshr.u64 RT4, re, #41; \
vshl.u64 RT5, re, #64 - 41; \
vadd.u64 RT0, RT0, rw0; \
veor.64 RT23q, RT23q, RT45q; \
vmov.64 RT7, re; \
veor.64 RT1, RT2, RT3; \
vbsl.64 RT7, rf, rg; \
\
vadd.u64 RT1, RT1, rh; \
vshr.u64 RT2, ra, #28; \
vshl.u64 RT3, ra, #64 - 28; \
vadd.u64 RT1, RT1, RT0; \
vshr.u64 RT4, ra, #34; \
vshl.u64 RT5, ra, #64 - 34; \
vadd.u64 RT1, RT1, RT7; \
\
/* h = Sum0 (a) + Maj (a, b, c); */ \
veor.64 RT23q, RT23q, RT45q; \
vshr.u64 RT4, ra, #39; \
vshl.u64 RT5, ra, #64 - 39; \
veor.64 RT0, ra, rb; \
veor.64 RT23q, RT23q, RT45q; \
vbsl.64 RT0, rc, rb; \
vadd.u64 rd, rd, RT1; /* d+=t1; */ \
veor.64 rh, RT2, RT3; \
\
/* t1 = g + Sum1 (d) + Ch (d, e, f) + k[t] + w[t]; */ \
vshr.u64 RT2, rd, #14; \
vshl.u64 RT3, rd, #64 - 14; \
vadd.u64 rh, rh, RT0; \
vshr.u64 RT4, rd, #18; \
vshl.u64 RT5, rd, #64 - 18; \
vadd.u64 rh, rh, RT1; /* h+=t1; */ \
vld1.64 {RT0}, [RK]!; \
veor.64 RT23q, RT23q, RT45q; \
vshr.u64 RT4, rd, #41; \
vshl.u64 RT5, rd, #64 - 41; \
vadd.u64 RT0, RT0, rw1; \
veor.64 RT23q, RT23q, RT45q; \
vmov.64 RT7, rd; \
veor.64 RT1, RT2, RT3; \
vbsl.64 RT7, re, rf; \
\
vadd.u64 RT1, RT1, rg; \
vshr.u64 RT2, rh, #28; \
vshl.u64 RT3, rh, #64 - 28; \
vadd.u64 RT1, RT1, RT0; \
vshr.u64 RT4, rh, #34; \
vshl.u64 RT5, rh, #64 - 34; \
vadd.u64 RT1, RT1, RT7; \
\
/* g = Sum0 (h) + Maj (h, a, b); */ \
veor.64 RT23q, RT23q, RT45q; \
vshr.u64 RT4, rh, #39; \
vshl.u64 RT5, rh, #64 - 39; \
veor.64 RT0, rh, ra; \
veor.64 RT23q, RT23q, RT45q; \
vbsl.64 RT0, rb, ra; \
vadd.u64 rc, rc, RT1; /* c+=t1; */ \
veor.64 rg, RT2, RT3;
#define vadd_rg_RT0(rg) \
vadd.u64 rg, rg, RT0;
#define vadd_rg_RT1(rg) \
vadd.u64 rg, rg, RT1; /* g+=t1; */
.align 3
ENTRY(sha512_transform_neon)
/* Input:
* %r0: SHA512_CONTEXT
* %r1: data
* %r2: u64 k[] constants
* %r3: nblks
*/
push {%lr};
mov %lr, #0;
/* Load context to d0-d7 */
vld1.64 {RA-RD}, [%r0]!;
vld1.64 {RE-RH}, [%r0];
sub %r0, #(4*8);
/* Load input to w[16], d16-d31 */
/* NOTE: Assumes that on ARMv7 unaligned accesses are always allowed. */
vld1.64 {RW0-RW3}, [%r1]!;
vld1.64 {RW4-RW7}, [%r1]!;
vld1.64 {RW8-RW11}, [%r1]!;
vld1.64 {RW12-RW15}, [%r1]!;
#ifdef __ARMEL__
/* byteswap */
vrev64.8 RW01q, RW01q;
vrev64.8 RW23q, RW23q;
vrev64.8 RW45q, RW45q;
vrev64.8 RW67q, RW67q;
vrev64.8 RW89q, RW89q;
vrev64.8 RW1011q, RW1011q;
vrev64.8 RW1213q, RW1213q;
vrev64.8 RW1415q, RW1415q;
#endif
/* EABI says that d8-d15 must be preserved by callee. */
/*vpush {RT0-RT7};*/
.Loop:
rounds2_0_63(RA, RB, RC, RD, RE, RF, RG, RH, RW0, RW1, RW01q, RW2,
RW23q, RW1415q, RW9, RW10, dummy, _);
b .Lenter_rounds;
.Loop_rounds:
rounds2_0_63(RA, RB, RC, RD, RE, RF, RG, RH, RW0, RW1, RW01q, RW2,
RW23q, RW1415q, RW9, RW10, vadd_RT01q, RW1415q);
.Lenter_rounds:
rounds2_0_63(RG, RH, RA, RB, RC, RD, RE, RF, RW2, RW3, RW23q, RW4,
RW45q, RW01q, RW11, RW12, vadd_RT01q, RW01q);
rounds2_0_63(RE, RF, RG, RH, RA, RB, RC, RD, RW4, RW5, RW45q, RW6,
RW67q, RW23q, RW13, RW14, vadd_RT01q, RW23q);
rounds2_0_63(RC, RD, RE, RF, RG, RH, RA, RB, RW6, RW7, RW67q, RW8,
RW89q, RW45q, RW15, RW0, vadd_RT01q, RW45q);
rounds2_0_63(RA, RB, RC, RD, RE, RF, RG, RH, RW8, RW9, RW89q, RW10,
RW1011q, RW67q, RW1, RW2, vadd_RT01q, RW67q);
rounds2_0_63(RG, RH, RA, RB, RC, RD, RE, RF, RW10, RW11, RW1011q, RW12,
RW1213q, RW89q, RW3, RW4, vadd_RT01q, RW89q);
add %lr, #16;
rounds2_0_63(RE, RF, RG, RH, RA, RB, RC, RD, RW12, RW13, RW1213q, RW14,
RW1415q, RW1011q, RW5, RW6, vadd_RT01q, RW1011q);
cmp %lr, #64;
rounds2_0_63(RC, RD, RE, RF, RG, RH, RA, RB, RW14, RW15, RW1415q, RW0,
RW01q, RW1213q, RW7, RW8, vadd_RT01q, RW1213q);
bne .Loop_rounds;
subs %r3, #1;
rounds2_64_79(RA, RB, RC, RD, RE, RF, RG, RH, RW0, RW1,
vadd_RT01q, RW1415q, dummy, _);
rounds2_64_79(RG, RH, RA, RB, RC, RD, RE, RF, RW2, RW3,
vadd_rg_RT0, RG, vadd_rg_RT1, RG);
beq .Lhandle_tail;
vld1.64 {RW0-RW3}, [%r1]!;
rounds2_64_79(RE, RF, RG, RH, RA, RB, RC, RD, RW4, RW5,
vadd_rg_RT0, RE, vadd_rg_RT1, RE);
rounds2_64_79(RC, RD, RE, RF, RG, RH, RA, RB, RW6, RW7,
vadd_rg_RT0, RC, vadd_rg_RT1, RC);
#ifdef __ARMEL__
vrev64.8 RW01q, RW01q;
vrev64.8 RW23q, RW23q;
#endif
vld1.64 {RW4-RW7}, [%r1]!;
rounds2_64_79(RA, RB, RC, RD, RE, RF, RG, RH, RW8, RW9,
vadd_rg_RT0, RA, vadd_rg_RT1, RA);
rounds2_64_79(RG, RH, RA, RB, RC, RD, RE, RF, RW10, RW11,
vadd_rg_RT0, RG, vadd_rg_RT1, RG);
#ifdef __ARMEL__
vrev64.8 RW45q, RW45q;
vrev64.8 RW67q, RW67q;
#endif
vld1.64 {RW8-RW11}, [%r1]!;
rounds2_64_79(RE, RF, RG, RH, RA, RB, RC, RD, RW12, RW13,
vadd_rg_RT0, RE, vadd_rg_RT1, RE);
rounds2_64_79(RC, RD, RE, RF, RG, RH, RA, RB, RW14, RW15,
vadd_rg_RT0, RC, vadd_rg_RT1, RC);
#ifdef __ARMEL__
vrev64.8 RW89q, RW89q;
vrev64.8 RW1011q, RW1011q;
#endif
vld1.64 {RW12-RW15}, [%r1]!;
vadd_rg_RT0(RA);
vadd_rg_RT1(RA);
/* Load context */
vld1.64 {RT0-RT3}, [%r0]!;
vld1.64 {RT4-RT7}, [%r0];
sub %r0, #(4*8);
#ifdef __ARMEL__
vrev64.8 RW1213q, RW1213q;
vrev64.8 RW1415q, RW1415q;
#endif
vadd.u64 RA, RT0;
vadd.u64 RB, RT1;
vadd.u64 RC, RT2;
vadd.u64 RD, RT3;
vadd.u64 RE, RT4;
vadd.u64 RF, RT5;
vadd.u64 RG, RT6;
vadd.u64 RH, RT7;
/* Store the first half of context */
vst1.64 {RA-RD}, [%r0]!;
sub RK, $(8*80);
vst1.64 {RE-RH}, [%r0]; /* Store the last half of context */
mov %lr, #0;
sub %r0, #(4*8);
b .Loop;
.Lhandle_tail:
rounds2_64_79(RE, RF, RG, RH, RA, RB, RC, RD, RW4, RW5,
vadd_rg_RT0, RE, vadd_rg_RT1, RE);
rounds2_64_79(RC, RD, RE, RF, RG, RH, RA, RB, RW6, RW7,
vadd_rg_RT0, RC, vadd_rg_RT1, RC);
rounds2_64_79(RA, RB, RC, RD, RE, RF, RG, RH, RW8, RW9,
vadd_rg_RT0, RA, vadd_rg_RT1, RA);
rounds2_64_79(RG, RH, RA, RB, RC, RD, RE, RF, RW10, RW11,
vadd_rg_RT0, RG, vadd_rg_RT1, RG);
rounds2_64_79(RE, RF, RG, RH, RA, RB, RC, RD, RW12, RW13,
vadd_rg_RT0, RE, vadd_rg_RT1, RE);
rounds2_64_79(RC, RD, RE, RF, RG, RH, RA, RB, RW14, RW15,
vadd_rg_RT0, RC, vadd_rg_RT1, RC);
/* Load context to d16-d23 */
vld1.64 {RW0-RW3}, [%r0]!;
vadd_rg_RT0(RA);
vld1.64 {RW4-RW7}, [%r0];
vadd_rg_RT1(RA);
sub %r0, #(4*8);
vadd.u64 RA, RW0;
vadd.u64 RB, RW1;
vadd.u64 RC, RW2;
vadd.u64 RD, RW3;
vadd.u64 RE, RW4;
vadd.u64 RF, RW5;
vadd.u64 RG, RW6;
vadd.u64 RH, RW7;
/* Store the first half of context */
vst1.64 {RA-RD}, [%r0]!;
/* Clear used registers */
/* d16-d31 */
veor.u64 RW01q, RW01q;
veor.u64 RW23q, RW23q;
veor.u64 RW45q, RW45q;
veor.u64 RW67q, RW67q;
vst1.64 {RE-RH}, [%r0]; /* Store the last half of context */
veor.u64 RW89q, RW89q;
veor.u64 RW1011q, RW1011q;
veor.u64 RW1213q, RW1213q;
veor.u64 RW1415q, RW1415q;
/* d8-d15 */
/*vpop {RT0-RT7};*/
/* d0-d7 (q0-q3) */
veor.u64 %q0, %q0;
veor.u64 %q1, %q1;
veor.u64 %q2, %q2;
veor.u64 %q3, %q3;
pop {%pc};
ENDPROC(sha512_transform_neon)


@@ -0,0 +1,305 @@
/*
* Glue code for the SHA512 Secure Hash Algorithm assembly implementation
* using NEON instructions.
*
* Copyright © 2014 Jussi Kivilinna <jussi.kivilinna@iki.fi>
*
* This file is based on sha512_ssse3_glue.c:
* Copyright (C) 2013 Intel Corporation
* Author: Tim Chen <tim.c.chen@linux.intel.com>
*
* This program is free software; you can redistribute it and/or modify it
* under the terms of the GNU General Public License as published by the Free
* Software Foundation; either version 2 of the License, or (at your option)
* any later version.
*
*/
#include <crypto/internal/hash.h>
#include <linux/init.h>
#include <linux/module.h>
#include <linux/mm.h>
#include <linux/cryptohash.h>
#include <linux/types.h>
#include <linux/string.h>
#include <crypto/sha.h>
#include <asm/byteorder.h>
#include <asm/simd.h>
#include <asm/neon.h>
static const u64 sha512_k[] = {
0x428a2f98d728ae22ULL, 0x7137449123ef65cdULL,
0xb5c0fbcfec4d3b2fULL, 0xe9b5dba58189dbbcULL,
0x3956c25bf348b538ULL, 0x59f111f1b605d019ULL,
0x923f82a4af194f9bULL, 0xab1c5ed5da6d8118ULL,
0xd807aa98a3030242ULL, 0x12835b0145706fbeULL,
0x243185be4ee4b28cULL, 0x550c7dc3d5ffb4e2ULL,
0x72be5d74f27b896fULL, 0x80deb1fe3b1696b1ULL,
0x9bdc06a725c71235ULL, 0xc19bf174cf692694ULL,
0xe49b69c19ef14ad2ULL, 0xefbe4786384f25e3ULL,
0x0fc19dc68b8cd5b5ULL, 0x240ca1cc77ac9c65ULL,
0x2de92c6f592b0275ULL, 0x4a7484aa6ea6e483ULL,
0x5cb0a9dcbd41fbd4ULL, 0x76f988da831153b5ULL,
0x983e5152ee66dfabULL, 0xa831c66d2db43210ULL,
0xb00327c898fb213fULL, 0xbf597fc7beef0ee4ULL,
0xc6e00bf33da88fc2ULL, 0xd5a79147930aa725ULL,
0x06ca6351e003826fULL, 0x142929670a0e6e70ULL,
0x27b70a8546d22ffcULL, 0x2e1b21385c26c926ULL,
0x4d2c6dfc5ac42aedULL, 0x53380d139d95b3dfULL,
0x650a73548baf63deULL, 0x766a0abb3c77b2a8ULL,
0x81c2c92e47edaee6ULL, 0x92722c851482353bULL,
0xa2bfe8a14cf10364ULL, 0xa81a664bbc423001ULL,
0xc24b8b70d0f89791ULL, 0xc76c51a30654be30ULL,
0xd192e819d6ef5218ULL, 0xd69906245565a910ULL,
0xf40e35855771202aULL, 0x106aa07032bbd1b8ULL,
0x19a4c116b8d2d0c8ULL, 0x1e376c085141ab53ULL,
0x2748774cdf8eeb99ULL, 0x34b0bcb5e19b48a8ULL,
0x391c0cb3c5c95a63ULL, 0x4ed8aa4ae3418acbULL,
0x5b9cca4f7763e373ULL, 0x682e6ff3d6b2b8a3ULL,
0x748f82ee5defb2fcULL, 0x78a5636f43172f60ULL,
0x84c87814a1f0ab72ULL, 0x8cc702081a6439ecULL,
0x90befffa23631e28ULL, 0xa4506cebde82bde9ULL,
0xbef9a3f7b2c67915ULL, 0xc67178f2e372532bULL,
0xca273eceea26619cULL, 0xd186b8c721c0c207ULL,
0xeada7dd6cde0eb1eULL, 0xf57d4f7fee6ed178ULL,
0x06f067aa72176fbaULL, 0x0a637dc5a2c898a6ULL,
0x113f9804bef90daeULL, 0x1b710b35131c471bULL,
0x28db77f523047d84ULL, 0x32caab7b40c72493ULL,
0x3c9ebe0a15c9bebcULL, 0x431d67c49c100d4cULL,
0x4cc5d4becb3e42b6ULL, 0x597f299cfc657e2aULL,
0x5fcb6fab3ad6faecULL, 0x6c44198c4a475817ULL
};
asmlinkage void sha512_transform_neon(u64 *digest, const void *data,
const u64 k[], unsigned int num_blks);
static int sha512_neon_init(struct shash_desc *desc)
{
struct sha512_state *sctx = shash_desc_ctx(desc);
sctx->state[0] = SHA512_H0;
sctx->state[1] = SHA512_H1;
sctx->state[2] = SHA512_H2;
sctx->state[3] = SHA512_H3;
sctx->state[4] = SHA512_H4;
sctx->state[5] = SHA512_H5;
sctx->state[6] = SHA512_H6;
sctx->state[7] = SHA512_H7;
sctx->count[0] = sctx->count[1] = 0;
return 0;
}
static int __sha512_neon_update(struct shash_desc *desc, const u8 *data,
unsigned int len, unsigned int partial)
{
struct sha512_state *sctx = shash_desc_ctx(desc);
unsigned int done = 0;
sctx->count[0] += len;
if (sctx->count[0] < len)
sctx->count[1]++;
if (partial) {
done = SHA512_BLOCK_SIZE - partial;
memcpy(sctx->buf + partial, data, done);
sha512_transform_neon(sctx->state, sctx->buf, sha512_k, 1);
}
if (len - done >= SHA512_BLOCK_SIZE) {
const unsigned int rounds = (len - done) / SHA512_BLOCK_SIZE;
sha512_transform_neon(sctx->state, data + done, sha512_k,
rounds);
done += rounds * SHA512_BLOCK_SIZE;
}
memcpy(sctx->buf, data + done, len - done);
return 0;
}
static int sha512_neon_update(struct shash_desc *desc, const u8 *data,
unsigned int len)
{
struct sha512_state *sctx = shash_desc_ctx(desc);
unsigned int partial = sctx->count[0] % SHA512_BLOCK_SIZE;
int res;
/* Handle the fast case right here */
if (partial + len < SHA512_BLOCK_SIZE) {
sctx->count[0] += len;
if (sctx->count[0] < len)
sctx->count[1]++;
memcpy(sctx->buf + partial, data, len);
return 0;
}
if (!may_use_simd()) {
res = crypto_sha512_update(desc, data, len);
} else {
kernel_neon_begin();
res = __sha512_neon_update(desc, data, len, partial);
kernel_neon_end();
}
return res;
}
/* Add padding and return the message digest. */
static int sha512_neon_final(struct shash_desc *desc, u8 *out)
{
struct sha512_state *sctx = shash_desc_ctx(desc);
unsigned int i, index, padlen;
__be64 *dst = (__be64 *)out;
__be64 bits[2];
static const u8 padding[SHA512_BLOCK_SIZE] = { 0x80, };
/* save number of bits */
bits[1] = cpu_to_be64(sctx->count[0] << 3);
bits[0] = cpu_to_be64(sctx->count[1] << 3 | sctx->count[0] >> 61);
/* Pad out to 112 mod 128 and append length */
index = sctx->count[0] & 0x7f;
padlen = (index < 112) ? (112 - index) : ((128+112) - index);
if (!may_use_simd()) {
crypto_sha512_update(desc, padding, padlen);
crypto_sha512_update(desc, (const u8 *)&bits, sizeof(bits));
} else {
kernel_neon_begin();
/* We need to fill a whole block for __sha512_neon_update() */
if (padlen <= 112) {
sctx->count[0] += padlen;
if (sctx->count[0] < padlen)
sctx->count[1]++;
memcpy(sctx->buf + index, padding, padlen);
} else {
__sha512_neon_update(desc, padding, padlen, index);
}
__sha512_neon_update(desc, (const u8 *)&bits,
sizeof(bits), 112);
kernel_neon_end();
}
/* Store state in digest */
for (i = 0; i < 8; i++)
dst[i] = cpu_to_be64(sctx->state[i]);
/* Wipe context */
memset(sctx, 0, sizeof(*sctx));
return 0;
}
static int sha512_neon_export(struct shash_desc *desc, void *out)
{
struct sha512_state *sctx = shash_desc_ctx(desc);
memcpy(out, sctx, sizeof(*sctx));
return 0;
}
static int sha512_neon_import(struct shash_desc *desc, const void *in)
{
struct sha512_state *sctx = shash_desc_ctx(desc);
memcpy(sctx, in, sizeof(*sctx));
return 0;
}
static int sha384_neon_init(struct shash_desc *desc)
{
struct sha512_state *sctx = shash_desc_ctx(desc);
sctx->state[0] = SHA384_H0;
sctx->state[1] = SHA384_H1;
sctx->state[2] = SHA384_H2;
sctx->state[3] = SHA384_H3;
sctx->state[4] = SHA384_H4;
sctx->state[5] = SHA384_H5;
sctx->state[6] = SHA384_H6;
sctx->state[7] = SHA384_H7;
sctx->count[0] = sctx->count[1] = 0;
return 0;
}
static int sha384_neon_final(struct shash_desc *desc, u8 *hash)
{
u8 D[SHA512_DIGEST_SIZE];
sha512_neon_final(desc, D);
memcpy(hash, D, SHA384_DIGEST_SIZE);
memset(D, 0, SHA512_DIGEST_SIZE);
return 0;
}
static struct shash_alg algs[] = { {
.digestsize = SHA512_DIGEST_SIZE,
.init = sha512_neon_init,
.update = sha512_neon_update,
.final = sha512_neon_final,
.export = sha512_neon_export,
.import = sha512_neon_import,
.descsize = sizeof(struct sha512_state),
.statesize = sizeof(struct sha512_state),
.base = {
.cra_name = "sha512",
.cra_driver_name = "sha512-neon",
.cra_priority = 250,
.cra_flags = CRYPTO_ALG_TYPE_SHASH,
.cra_blocksize = SHA512_BLOCK_SIZE,
.cra_module = THIS_MODULE,
}
}, {
.digestsize = SHA384_DIGEST_SIZE,
.init = sha384_neon_init,
.update = sha512_neon_update,
.final = sha384_neon_final,
.export = sha512_neon_export,
.import = sha512_neon_import,
.descsize = sizeof(struct sha512_state),
.statesize = sizeof(struct sha512_state),
.base = {
.cra_name = "sha384",
.cra_driver_name = "sha384-neon",
.cra_priority = 250,
.cra_flags = CRYPTO_ALG_TYPE_SHASH,
.cra_blocksize = SHA384_BLOCK_SIZE,
.cra_module = THIS_MODULE,
}
} };
static int __init sha512_neon_mod_init(void)
{
if (!cpu_has_neon())
return -ENODEV;
return crypto_register_shashes(algs, ARRAY_SIZE(algs));
}
static void __exit sha512_neon_mod_fini(void)
{
crypto_unregister_shashes(algs, ARRAY_SIZE(algs));
}
module_init(sha512_neon_mod_init);
module_exit(sha512_neon_mod_fini);
MODULE_LICENSE("GPL");
MODULE_DESCRIPTION("SHA512 Secure Hash Algorithm, NEON accelerated");
MODULE_ALIAS("sha512");
MODULE_ALIAS("sha384");
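
The padding arithmetic in sha512_neon_final() can be checked in isolation. The following is a minimal userspace sketch, not part of the commit: the helper names sha512_padlen() and sha512_bitcount() are hypothetical, and the byte-order conversion that the kernel code does with cpu_to_be64() is left out.

```c
#include <stdint.h>

/*
 * Number of padding bytes needed so that the message length becomes
 * 112 mod 128, leaving 16 bytes for the 128-bit length field.
 * count_lo is the low 64 bits of the byte counter.
 */
static unsigned int sha512_padlen(uint64_t count_lo)
{
	unsigned int index = count_lo & 0x7f;

	return (index < 112) ? (112 - index) : ((128 + 112) - index);
}

/*
 * Convert the 128-bit byte counter (count_hi:count_lo) into a 128-bit
 * bit counter (bits_hi:bits_lo). The top three bits of the low word
 * carry into the high word.
 */
static void sha512_bitcount(uint64_t count_lo, uint64_t count_hi,
			    uint64_t *bits_hi, uint64_t *bits_lo)
{
	*bits_lo = count_lo << 3;
	*bits_hi = count_hi << 3 | count_lo >> 61;
}
```

For any input length, index + padlen + 16 is a multiple of 128, so the two 64-bit length words always end exactly on a block boundary.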


@@ -24,6 +24,7 @@ generic-y += sembuf.h
generic-y += serial.h
generic-y += shmbuf.h
generic-y += siginfo.h
generic-y += simd.h
generic-y += sizes.h
generic-y += socket.h
generic-y += sockios.h


@@ -114,7 +114,8 @@ static inline int atomic_sub_return(int i, atomic_t *v)
static inline int atomic_cmpxchg(atomic_t *ptr, int old, int new)
{
unsigned long oldval, res;
int oldval;
unsigned long res;
smp_mb();
@@ -238,15 +239,15 @@ static inline int __atomic_add_unless(atomic_t *v, int a, int u)
#ifndef CONFIG_GENERIC_ATOMIC64
typedef struct {
u64 __aligned(8) counter;
long long counter;
} atomic64_t;
#define ATOMIC64_INIT(i) { (i) }
#ifdef CONFIG_ARM_LPAE
static inline u64 atomic64_read(const atomic64_t *v)
static inline long long atomic64_read(const atomic64_t *v)
{
u64 result;
long long result;
__asm__ __volatile__("@ atomic64_read\n"
" ldrd %0, %H0, [%1]"
@@ -257,7 +258,7 @@ static inline u64 atomic64_read(const atomic64_t *v)
return result;
}
static inline void atomic64_set(atomic64_t *v, u64 i)
static inline void atomic64_set(atomic64_t *v, long long i)
{
__asm__ __volatile__("@ atomic64_set\n"
" strd %2, %H2, [%1]"
@@ -266,9 +267,9 @@ static inline void atomic64_set(atomic64_t *v, u64 i)
);
}
#else
static inline u64 atomic64_read(const atomic64_t *v)
static inline long long atomic64_read(const atomic64_t *v)
{
u64 result;
long long result;
__asm__ __volatile__("@ atomic64_read\n"
" ldrexd %0, %H0, [%1]"
@@ -279,9 +280,9 @@ static inline u64 atomic64_read(const atomic64_t *v)
return result;
}
static inline void atomic64_set(atomic64_t *v, u64 i)
static inline void atomic64_set(atomic64_t *v, long long i)
{
u64 tmp;
long long tmp;
__asm__ __volatile__("@ atomic64_set\n"
"1: ldrexd %0, %H0, [%2]\n"
@@ -294,9 +295,9 @@ static inline void atomic64_set(atomic64_t *v, u64 i)
}
#endif
static inline void atomic64_add(u64 i, atomic64_t *v)
static inline void atomic64_add(long long i, atomic64_t *v)
{
u64 result;
long long result;
unsigned long tmp;
__asm__ __volatile__("@ atomic64_add\n"
@@ -311,9 +312,9 @@ static inline void atomic64_add(u64 i, atomic64_t *v)
: "cc");
}
static inline u64 atomic64_add_return(u64 i, atomic64_t *v)
static inline long long atomic64_add_return(long long i, atomic64_t *v)
{
u64 result;
long long result;
unsigned long tmp;
smp_mb();
@@ -334,9 +335,9 @@ static inline u64 atomic64_add_return(u64 i, atomic64_t *v)
return result;
}
static inline void atomic64_sub(u64 i, atomic64_t *v)
static inline void atomic64_sub(long long i, atomic64_t *v)
{
u64 result;
long long result;
unsigned long tmp;
__asm__ __volatile__("@ atomic64_sub\n"
@@ -351,9 +352,9 @@ static inline void atomic64_sub(u64 i, atomic64_t *v)
: "cc");
}
static inline u64 atomic64_sub_return(u64 i, atomic64_t *v)
static inline long long atomic64_sub_return(long long i, atomic64_t *v)
{
u64 result;
long long result;
unsigned long tmp;
smp_mb();
@@ -374,9 +375,10 @@ static inline u64 atomic64_sub_return(u64 i, atomic64_t *v)
return result;
}
static inline u64 atomic64_cmpxchg(atomic64_t *ptr, u64 old, u64 new)
static inline long long atomic64_cmpxchg(atomic64_t *ptr, long long old,
long long new)
{
u64 oldval;
long long oldval;
unsigned long res;
smp_mb();
@@ -398,9 +400,9 @@ static inline u64 atomic64_cmpxchg(atomic64_t *ptr, u64 old, u64 new)
return oldval;
}
static inline u64 atomic64_xchg(atomic64_t *ptr, u64 new)
static inline long long atomic64_xchg(atomic64_t *ptr, long long new)
{
u64 result;
long long result;
unsigned long tmp;
smp_mb();
@@ -419,9 +421,9 @@ static inline u64 atomic64_xchg(atomic64_t *ptr, u64 new)
return result;
}
static inline u64 atomic64_dec_if_positive(atomic64_t *v)
static inline long long atomic64_dec_if_positive(atomic64_t *v)
{
u64 result;
long long result;
unsigned long tmp;
smp_mb();
@@ -445,9 +447,9 @@ static inline u64 atomic64_dec_if_positive(atomic64_t *v)
return result;
}
static inline int atomic64_add_unless(atomic64_t *v, u64 a, u64 u)
static inline int atomic64_add_unless(atomic64_t *v, long long a, long long u)
{
u64 val;
long long val;
unsigned long tmp;
int ret = 1;
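
The move from u64 to long long matters because some of these helpers report state through the sign of the value. atomic64_dec_if_positive(), for instance, returns the old value minus one even when it did not decrement, so a negative result has to be representable. The sketch below shows that contract in userspace with C11 atomics; it is an illustration under that assumption, not the ARM ldrexd/strexd implementation, and dec_if_positive() is a hypothetical name.

```c
#include <stdatomic.h>

/*
 * Decrement *v only if the result stays non-negative.
 * Returns the old value minus one in all cases, so a negative
 * return value means "nothing was decremented".
 */
static long long dec_if_positive(_Atomic long long *v)
{
	long long old = atomic_load(v);
	long long dec;

	do {
		dec = old - 1;
		if (dec < 0)
			break;	/* would go negative: leave the counter alone */
		/* on failure, old is reloaded with the current value */
	} while (!atomic_compare_exchange_weak(v, &old, dec));

	return dec;
}
```

With an unsigned 64-bit return type the "dec < 0" test could never be true, and callers could not tell a refused decrement from a very large counter.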


@@ -0,0 +1,10 @@
#ifndef ASM_ARM_CRYPTO_SHA1_H
#define ASM_ARM_CRYPTO_SHA1_H
#include <linux/crypto.h>
#include <crypto/sha.h>
extern int sha1_update_arm(struct shash_desc *desc, const u8 *data,
unsigned int len);
#endif


@@ -98,23 +98,19 @@
#define TASK_UNMAPPED_BASE UL(0x00000000)
#endif
#ifndef PHYS_OFFSET
#define PHYS_OFFSET UL(CONFIG_DRAM_BASE)
#endif
#ifndef END_MEM
#define END_MEM (UL(CONFIG_DRAM_BASE) + CONFIG_DRAM_SIZE)
#endif
#ifndef PAGE_OFFSET
#define PAGE_OFFSET (PHYS_OFFSET)
#define PAGE_OFFSET PLAT_PHYS_OFFSET
#endif
/*
* The module can be at any place in ram in nommu mode.
*/
#define MODULES_END (END_MEM)
#define MODULES_VADDR (PHYS_OFFSET)
#define MODULES_VADDR PAGE_OFFSET
#define XIP_VIRT_ADDR(physaddr) (physaddr)
@@ -141,6 +137,16 @@
#define page_to_phys(page) (__pfn_to_phys(page_to_pfn(page)))
#define phys_to_page(phys) (pfn_to_page(__phys_to_pfn(phys)))
/*
* PLAT_PHYS_OFFSET is the offset (from zero) of the start of physical
* memory. This is used for XIP and NoMMU kernels, or by kernels which
* have their own mach/memory.h. Assembly code must always use
* PLAT_PHYS_OFFSET and not PHYS_OFFSET.
*/
#ifndef PLAT_PHYS_OFFSET
#define PLAT_PHYS_OFFSET UL(CONFIG_PHYS_OFFSET)
#endif
#ifndef __ASSEMBLY__
/*
@@ -184,22 +190,15 @@ static inline unsigned long __phys_to_virt(unsigned long x)
return t;
}
#else
#define PHYS_OFFSET PLAT_PHYS_OFFSET
#define __virt_to_phys(x) ((x) - PAGE_OFFSET + PHYS_OFFSET)
#define __phys_to_virt(x) ((x) - PHYS_OFFSET + PAGE_OFFSET)
#endif
#endif
#endif /* __ASSEMBLY__ */
#ifndef PHYS_OFFSET
#ifdef PLAT_PHYS_OFFSET
#define PHYS_OFFSET PLAT_PHYS_OFFSET
#else
#define PHYS_OFFSET UL(CONFIG_PHYS_OFFSET)
#endif
#endif
#ifndef __ASSEMBLY__
/*
* PFNs are used to describe any physical page; this means
* PFN 0 == physical address 0.
@@ -208,7 +207,7 @@ static inline unsigned long __phys_to_virt(unsigned long x)
* direct-mapped view. We assume this is the first page
* of RAM in the mem_map as well.
*/
#define PHYS_PFN_OFFSET (PHYS_OFFSET >> PAGE_SHIFT)
#define PHYS_PFN_OFFSET ((unsigned long)(PHYS_OFFSET >> PAGE_SHIFT))
/*
* These are *only* valid on the kernel direct mapped RAM memory.
@@ -291,7 +290,8 @@ static inline __deprecated void *bus_to_virt(unsigned long x)
#define ARCH_PFN_OFFSET PHYS_PFN_OFFSET
#define virt_to_page(kaddr) pfn_to_page(__pa(kaddr) >> PAGE_SHIFT)
#define virt_addr_valid(kaddr) ((unsigned long)(kaddr) >= PAGE_OFFSET && (unsigned long)(kaddr) < (unsigned long)high_memory)
#define virt_addr_valid(kaddr) (((unsigned long)(kaddr) >= PAGE_OFFSET && (unsigned long)(kaddr) < (unsigned long)high_memory) \
&& pfn_valid(__pa(kaddr) >> PAGE_SHIFT) )
#endif
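
The non-patched __virt_to_phys()/__phys_to_virt() macros in this header are a plain linear offset between PAGE_OFFSET and PHYS_OFFSET. A small worked example, assuming hypothetical values of 0xC0000000 for PAGE_OFFSET and 0x80000000 for PHYS_OFFSET (neither value comes from this diff), and using uint32_t in place of a 32-bit unsigned long:

```c
#include <stdint.h>

/* Hypothetical layout, chosen only for illustration. */
#define EX_PAGE_OFFSET UINT32_C(0xC0000000)	/* start of the kernel linear map */
#define EX_PHYS_OFFSET UINT32_C(0x80000000)	/* start of physical RAM */

/* Same arithmetic as __virt_to_phys(): (x) - PAGE_OFFSET + PHYS_OFFSET */
static uint32_t ex_virt_to_phys(uint32_t x)
{
	return x - EX_PAGE_OFFSET + EX_PHYS_OFFSET;
}

/* Same arithmetic as __phys_to_virt(): (x) - PHYS_OFFSET + PAGE_OFFSET */
static uint32_t ex_phys_to_virt(uint32_t x)
{
	return x - EX_PHYS_OFFSET + EX_PAGE_OFFSET;
}
```

The two functions are inverses of each other for any address, which is why virt_addr_valid() needs its own range and pfn_valid() checks rather than relying on the translation itself.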


@@ -12,6 +12,8 @@ enum {
ARM_SEC_CORE,
ARM_SEC_EXIT,
ARM_SEC_DEVEXIT,
ARM_SEC_HOT,
ARM_SEC_UNLIKELY,
ARM_SEC_MAX,
};


@@ -0,0 +1,36 @@
/*
* linux/arch/arm/include/asm/neon.h
*
* Copyright (C) 2013 Linaro Ltd <ard.biesheuvel@linaro.org>
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License version 2 as
* published by the Free Software Foundation.
*/
#include <asm/hwcap.h>
#define cpu_has_neon() (!!(elf_hwcap & HWCAP_NEON))
#ifdef __ARM_NEON__
/*
* If you are affected by the BUILD_BUG below, it probably means that you are
* using NEON code /and/ calling the kernel_neon_begin() function from the same
* compilation unit. To prevent issues that may arise from GCC reordering or
* generating(1) NEON instructions outside of these begin/end functions, the
* only supported way of using NEON code in the kernel is by isolating it in a
* separate compilation unit, and calling it from another unit from inside a
* kernel_neon_begin/kernel_neon_end pair.
*
* (1) Current GCC (4.7) might generate NEON instructions at O3 level if
* -mfpu=neon is set.
*/
#define kernel_neon_begin() \
BUILD_BUG_ON_MSG(1, "kernel_neon_begin() called from NEON code")
#else
void kernel_neon_begin(void);
#endif
void kernel_neon_end(void);


@@ -13,7 +13,7 @@
/* PAGE_SHIFT determines the page size */
#define PAGE_SHIFT 12
#define PAGE_SIZE (_AC(1,UL) << PAGE_SHIFT)
#define PAGE_MASK (~(PAGE_SIZE-1))
#define PAGE_MASK (~((1 << PAGE_SHIFT) - 1))
#ifndef __ASSEMBLY__
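
The diff does not say why PAGE_MASK moves from (~(PAGE_SIZE-1)) to (~((1 << PAGE_SHIFT) - 1)). One plausible reading, given that the same change is applied to the 3-level (LPAE) page table header, is that a signed int mask sign-extends when combined with a 64-bit physical address, while a 32-bit unsigned mask zero-extends and clears the upper address bits. The sketch below illustrates only that difference; uint32_t stands in for a 32-bit unsigned long, and the helper names are hypothetical.

```c
#include <stdint.h>

#define EX_PAGE_SHIFT 12

/* Old style: the mask is a 32-bit unsigned value, zero-extended to 64 bits. */
static uint64_t mask_unsigned(uint64_t phys)
{
	uint32_t mask = (uint32_t)~((UINT32_C(1) << EX_PAGE_SHIFT) - 1);

	return phys & mask;
}

/* New style: the mask is a negative int, sign-extended to 64 bits. */
static uint64_t mask_signed(uint64_t phys)
{
	return phys & ~((1 << EX_PAGE_SHIFT) - 1);
}
```

For a physical address above 4 GiB such as 0x123456789, the unsigned form yields 0x23456000 and loses bit 32, while the signed form yields 0x123456000.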


@@ -72,6 +72,7 @@
#define PTE_TABLE_BIT (_AT(pteval_t, 1) << 1)
#define PTE_BUFFERABLE (_AT(pteval_t, 1) << 2) /* AttrIndx[0] */
#define PTE_CACHEABLE (_AT(pteval_t, 1) << 3) /* AttrIndx[1] */
#define PTE_AP2 (_AT(pteval_t, 1) << 7) /* AP[2] */
#define PTE_EXT_SHARED (_AT(pteval_t, 3) << 8) /* SH[1:0], inner shareable */
#define PTE_EXT_AF (_AT(pteval_t, 1) << 10) /* Access Flag */
#define PTE_EXT_NG (_AT(pteval_t, 1) << 11) /* nG */


@@ -33,7 +33,7 @@
#define PTRS_PER_PMD 512
#define PTRS_PER_PGD 4
#define PTE_HWTABLE_PTRS (PTRS_PER_PTE)
#define PTE_HWTABLE_PTRS (0)
#define PTE_HWTABLE_OFF (0)
#define PTE_HWTABLE_SIZE (PTRS_PER_PTE * sizeof(u64))
@@ -48,16 +48,16 @@
#define PMD_SHIFT 21
#define PMD_SIZE (1UL << PMD_SHIFT)
#define PMD_MASK (~(PMD_SIZE-1))
#define PMD_MASK (~((1 << PMD_SHIFT) - 1))
#define PGDIR_SIZE (1UL << PGDIR_SHIFT)
#define PGDIR_MASK (~(PGDIR_SIZE-1))
#define PGDIR_MASK (~((1 << PGDIR_SHIFT) - 1))
/*
* section address mask and size definitions.
*/
#define SECTION_SHIFT 21
#define SECTION_SIZE (1UL << SECTION_SHIFT)
#define SECTION_MASK (~(SECTION_SIZE-1))
#define SECTION_MASK (~((1 << SECTION_SHIFT) - 1))
#define USER_PTRS_PER_PGD (PAGE_OFFSET / PGDIR_SIZE)
@@ -79,13 +79,13 @@
#define L_PTE_PRESENT (_AT(pteval_t, 3) << 0) /* Present */
#define L_PTE_FILE (_AT(pteval_t, 1) << 2) /* only when !PRESENT */
#define L_PTE_USER (_AT(pteval_t, 1) << 6) /* AP[1] */
#define L_PTE_RDONLY (_AT(pteval_t, 1) << 7) /* AP[2] */
#define L_PTE_SHARED (_AT(pteval_t, 3) << 8) /* SH[1:0], inner shareable */
#define L_PTE_YOUNG (_AT(pteval_t, 1) << 10) /* AF */
#define L_PTE_XN (_AT(pteval_t, 1) << 54) /* XN */
#define L_PTE_DIRTY (_AT(pteval_t, 1) << 55) /* unused */
#define L_PTE_SPECIAL (_AT(pteval_t, 1) << 56) /* unused */
#define L_PTE_DIRTY (_AT(pteval_t, 1) << 55)
#define L_PTE_SPECIAL (_AT(pteval_t, 1) << 56)
#define L_PTE_NONE (_AT(pteval_t, 1) << 57) /* PROT_NONE */
#define L_PTE_RDONLY (_AT(pteval_t, 1) << 58) /* READ ONLY */
#define PMD_SECT_VALID (_AT(pmdval_t, 1) << 0)
#define PMD_SECT_DIRTY (_AT(pmdval_t, 1) << 55)


@@ -214,12 +214,16 @@ static inline pte_t *pmd_page_vaddr(pmd_t pmd)
#define pte_clear(mm,addr,ptep) set_pte_ext(ptep, __pte(0), 0)
#define pte_isset(pte, val) ((u32)(val) == (val) ? pte_val(pte) & (val) \
: !!(pte_val(pte) & (val)))
#define pte_isclear(pte, val) (!(pte_val(pte) & (val)))
#define pte_none(pte) (!pte_val(pte))
#define pte_present(pte) (pte_val(pte) & L_PTE_PRESENT)
#define pte_write(pte) (!(pte_val(pte) & L_PTE_RDONLY))
#define pte_dirty(pte) (pte_val(pte) & L_PTE_DIRTY)
#define pte_young(pte) (pte_val(pte) & L_PTE_YOUNG)
#define pte_exec(pte) (!(pte_val(pte) & L_PTE_XN))
#define pte_present(pte) (pte_isset((pte), L_PTE_PRESENT))
#define pte_write(pte) (pte_isclear((pte), L_PTE_RDONLY))
#define pte_dirty(pte) (pte_isset((pte), L_PTE_DIRTY))
#define pte_young(pte) (pte_isset((pte), L_PTE_YOUNG))
#define pte_exec(pte) (pte_isclear((pte), L_PTE_XN))
#define pte_special(pte) (0)
#define pte_present_user(pte) (pte_present(pte) && (pte_val(pte) & L_PTE_USER))
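
pte_isset() applies !! only when the tested value does not fit in 32 bits. Presumably this guards against callers that narrow the result to a 32-bit type: with 64-bit LPAE entries, a flag such as L_PTE_DIRTY at bit 55 would vanish in that narrowing. A minimal sketch of the effect, with hypothetical helper names and a local copy of the bit definition:

```c
#include <stdint.h>

typedef uint64_t ex_pteval_t;

#define EX_L_PTE_DIRTY ((ex_pteval_t)1 << 55)

/* Raw AND narrowed to 32 bits: bit 55 is discarded, result is always 0. */
static uint32_t dirty_truncated(ex_pteval_t pte)
{
	return (uint32_t)(pte & EX_L_PTE_DIRTY);
}

/* Normalised to 0 or 1 before narrowing, as pte_isset() does with !!. */
static uint32_t dirty_normalized(ex_pteval_t pte)
{
	return !!(pte & EX_L_PTE_DIRTY);
}
```

Flags that live in the low 32 bits survive narrowing unchanged, so the macro can skip the !! for them.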


@@ -110,7 +110,7 @@ ENTRY(stext)
sub r4, r3, r4 @ (PHYS_OFFSET - PAGE_OFFSET)
add r8, r8, r4 @ PHYS_OFFSET
#else
ldr r8, =PHYS_OFFSET @ always constant in this case
ldr r8, =PLAT_PHYS_OFFSET @ always constant in this case
#endif
/*


@@ -307,6 +307,10 @@ int module_finalize(const Elf32_Ehdr *hdr, const Elf_Shdr *sechdrs,
maps[ARM_SEC_EXIT].unw_sec = s;
else if (strcmp(".ARM.exidx.devexit.text", secname) == 0)
maps[ARM_SEC_DEVEXIT].unw_sec = s;
else if (strcmp(".ARM.exidx.text.unlikely", secname) == 0)
maps[ARM_SEC_UNLIKELY].unw_sec = s;
else if (strcmp(".ARM.exidx.text.hot", secname) == 0)
maps[ARM_SEC_HOT].unw_sec = s;
else if (strcmp(".init.text", secname) == 0)
maps[ARM_SEC_INIT].txt_sec = s;
else if (strcmp(".devinit.text", secname) == 0)
@@ -317,6 +321,10 @@ int module_finalize(const Elf32_Ehdr *hdr, const Elf_Shdr *sechdrs,
maps[ARM_SEC_EXIT].txt_sec = s;
else if (strcmp(".devexit.text", secname) == 0)
maps[ARM_SEC_DEVEXIT].txt_sec = s;
else if (strcmp(".text.unlikely", secname) == 0)
maps[ARM_SEC_UNLIKELY].txt_sec = s;
else if (strcmp(".text.hot", secname) == 0)
maps[ARM_SEC_HOT].txt_sec = s;
}
for (i = 0; i < ARM_SEC_MAX; i++)


@@ -301,8 +301,8 @@ int __init mx6q_clocks_init(void)
post_div_table[1].div = 1;
post_div_table[2].div = 1;
video_div_table[1].div = 1;
video_div_table[2].div = 1;
};
video_div_table[3].div = 1;
}
/* type name parent_name base div_mask */
clk[pll1_sys] = imx_clk_pllv3(IMX_PLLV3_SYS, "pll1_sys", "osc", base, 0x7f);


@@ -503,11 +503,11 @@ static void __init realtime_counter_init(void)
rate = clk_get_rate(sys_clk);
/* Numerator/denumerator values refer TRM Realtime Counter section */
switch (rate) {
case 1200000:
case 12000000:
num = 64;
den = 125;
break;
case 1300000:
case 13000000:
num = 768;
den = 1625;
break;
@@ -515,11 +515,11 @@ static void __init realtime_counter_init(void)
num = 8;
den = 25;
break;
case 2600000:
case 26000000:
num = 384;
den = 1625;
break;
case 2700000:
case 27000000:
num = 256;
den = 1125;
break;
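
The corrected case labels can be sanity-checked with plain arithmetic. Assuming num and den are applied as a multiply/divide ratio to the input clock (inferred from the names, not shown in this hunk), every corrected rate visible here scales to the same output frequency. scaled_rate() is a hypothetical helper for that check.

```c
#include <stdint.h>

/* Apply a num/den ratio to an input clock rate in Hz. */
static uint64_t scaled_rate(uint64_t rate, uint32_t num, uint32_t den)
{
	return rate * num / den;
}
```

12 MHz * 64/125, 13 MHz * 768/1625, 26 MHz * 384/1625 and 27 MHz * 256/1125 all give 6144000 Hz. The old labels were a factor of ten too small, so a 12 MHz sys_clk would never have matched "case 1200000".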


@@ -814,6 +814,7 @@ static struct platform_device ipmmu_device = {
static struct renesas_intc_irqpin_config irqpin0_platform_data = {
.irq_base = irq_pin(0), /* IRQ0 -> IRQ7 */
.control_parent = true,
};
static struct resource irqpin0_resources[] = {
@@ -875,6 +876,7 @@ static struct platform_device irqpin1_device = {
static struct renesas_intc_irqpin_config irqpin2_platform_data = {
.irq_base = irq_pin(16), /* IRQ16 -> IRQ23 */
.control_parent = true,
};
static struct resource irqpin2_resources[] = {
@@ -905,6 +907,7 @@ static struct platform_device irqpin2_device = {
static struct renesas_intc_irqpin_config irqpin3_platform_data = {
.irq_base = irq_pin(24), /* IRQ24 -> IRQ31 */
.control_parent = true,
};
static struct resource irqpin3_resources[] = {


@@ -417,12 +417,21 @@ void __init dma_contiguous_remap(void)
map.type = MT_MEMORY_DMA_READY;
/*
* Clear previous low-memory mapping
* Clear previous low-memory mapping to ensure that the
* TLB does not see any conflicting entries, then flush
* the TLB of the old entries before creating new mappings.
*
* This ensures that any speculatively loaded TLB entries
* (even though they may be rare) can not cause any problems,
* and ensures that this code is architecturally compliant.
*/
for (addr = __phys_to_virt(start); addr < __phys_to_virt(end);
addr += PMD_SIZE)
pmd_clear(pmd_off_k(addr));
flush_tlb_kernel_range(__phys_to_virt(start),
__phys_to_virt(end));
iotable_init(&map, 1);
}
}


@@ -707,8 +707,9 @@ static void __init alloc_init_pmd(pud_t *pud, unsigned long addr,
}
static void __init alloc_init_pud(pgd_t *pgd, unsigned long addr,
unsigned long end, unsigned long phys, const struct mem_type *type,
bool force_pages)
unsigned long end, phys_addr_t phys,
const struct mem_type *type,
bool force_pages)
{
pud_t *pud = pud_offset(pgd, addr);
unsigned long next;


@@ -78,8 +78,13 @@ ENTRY(cpu_v7_set_pte_ext)
tst rh, #1 << (57 - 32) @ L_PTE_NONE
bicne rl, #L_PTE_VALID
bne 1f
tst rh, #1 << (55 - 32) @ L_PTE_DIRTY
orreq rl, #L_PTE_RDONLY
eor ip, rh, #1 << (55 - 32) @ toggle L_PTE_DIRTY in temp reg to
@ test for !L_PTE_DIRTY || L_PTE_RDONLY
tst ip, #1 << (55 - 32) | 1 << (58 - 32)
orrne rl, #PTE_AP2
biceq rl, #PTE_AP2
1: strd r2, r3, [r0]
ALT_SMP(W(nop))
ALT_UP (mcr p15, 0, r0, c7, c10, 1) @ flush_pte
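
The eor/tst pair in cpu_v7_set_pte_ext tests "!L_PTE_DIRTY || L_PTE_RDONLY" with a single mask: toggling the dirty bit turns it into a "clean" bit, after which either bit being set means the hardware entry must be read-only (PTE_AP2). The same logic in C, as a sketch on the full 64-bit value rather than on the high word rh; needs_ap2() is a hypothetical name.

```c
#include <stdbool.h>
#include <stdint.h>

#define EX_L_PTE_DIRTY  (UINT64_C(1) << 55)
#define EX_L_PTE_RDONLY (UINT64_C(1) << 58)

/* True when the hardware AP[2] (read-only) bit should be set. */
static bool needs_ap2(uint64_t pte)
{
	/* After the toggle, bit 55 set means "not dirty". */
	uint64_t tmp = pte ^ EX_L_PTE_DIRTY;

	return (tmp & (EX_L_PTE_DIRTY | EX_L_PTE_RDONLY)) != 0;
}
```

Only a PTE that is dirty and not software read-only is mapped writable; a clean PTE stays read-only in hardware regardless of L_PTE_RDONLY.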


@@ -20,6 +20,7 @@
#include <linux/init.h>
#include <linux/uaccess.h>
#include <linux/user.h>
#include <linux/export.h>
#include <asm/cp15.h>
#include <asm/cputype.h>
@@ -648,6 +649,52 @@ static int vfp_hotplug(struct notifier_block *b, unsigned long action,
return NOTIFY_OK;
}
#ifdef CONFIG_KERNEL_MODE_NEON
/*
* Kernel-side NEON support functions
*/
void kernel_neon_begin(void)
{
struct thread_info *thread = current_thread_info();
unsigned int cpu;
u32 fpexc;
/*
* Kernel mode NEON is only allowed outside of interrupt context
* with preemption disabled. This will make sure that the kernel
* mode NEON register contents never need to be preserved.
*/
BUG_ON(in_interrupt());
cpu = get_cpu();
fpexc = fmrx(FPEXC) | FPEXC_EN;
fmxr(FPEXC, fpexc);
/*
* Save the userland NEON/VFP state. Under UP,
* the owner could be a task other than 'current'
*/
if (vfp_state_in_hw(cpu, thread))
vfp_save_state(&thread->vfpstate, fpexc);
#ifndef CONFIG_SMP
else if (vfp_current_hw_state[cpu] != NULL)
vfp_save_state(vfp_current_hw_state[cpu], fpexc);
#endif
vfp_current_hw_state[cpu] = NULL;
}
EXPORT_SYMBOL(kernel_neon_begin);
void kernel_neon_end(void)
{
/* Disable the NEON/VFP unit. */
fmxr(FPEXC, fmrx(FPEXC) & ~FPEXC_EN);
put_cpu();
}
EXPORT_SYMBOL(kernel_neon_end);
#endif /* CONFIG_KERNEL_MODE_NEON */
/*
* VFP support code initialisation.
*/
@@ -731,4 +778,4 @@ static int __init vfp_init(void)
return 0;
}
late_initcall(vfp_init);
core_initcall(vfp_init);


@@ -154,4 +154,5 @@ module_exit(sha1_powerpc_mod_fini);
MODULE_LICENSE("GPL");
MODULE_DESCRIPTION("SHA1 Secure Hash Algorithm");
MODULE_ALIAS("sha1-powerpc");
MODULE_ALIAS_CRYPTO("sha1");
MODULE_ALIAS_CRYPTO("sha1-powerpc");


@@ -757,77 +757,7 @@ struct device_node *of_find_next_cache_node(struct device_node *np)
return NULL;
}
#ifdef CONFIG_PPC_PSERIES
/*
* Fix up the uninitialized fields in a new device node:
* name, type and pci-specific fields
*/
static int of_finish_dynamic_node(struct device_node *node)
{
struct device_node *parent = of_get_parent(node);
int err = 0;
const phandle *ibm_phandle;
node->name = of_get_property(node, "name", NULL);
node->type = of_get_property(node, "device_type", NULL);
if (!node->name)
node->name = "<NULL>";
if (!node->type)
node->type = "<NULL>";
if (!parent) {
err = -ENODEV;
goto out;
}
/* We don't support that function on PowerMac, at least
* not yet
*/
if (machine_is(powermac))
return -ENODEV;
/* fix up new node's phandle field */
if ((ibm_phandle = of_get_property(node, "ibm,phandle", NULL)))
node->phandle = *ibm_phandle;
out:
of_node_put(parent);
return err;
}
static int prom_reconfig_notifier(struct notifier_block *nb,
unsigned long action, void *node)
{
int err;
switch (action) {
case OF_RECONFIG_ATTACH_NODE:
err = of_finish_dynamic_node(node);
if (err < 0)
printk(KERN_ERR "finish_node returned %d\n", err);
break;
default:
err = 0;
break;
}
return notifier_from_errno(err);
}
static struct notifier_block prom_reconfig_nb = {
.notifier_call = prom_reconfig_notifier,
.priority = 10, /* This one needs to run first */
};
static int __init prom_reconfig_setup(void)
{
return of_reconfig_notifier_register(&prom_reconfig_nb);
}
__initcall(prom_reconfig_setup);
#endif
bool arch_match_cpu_phys_id(int cpu, u64 phys_id)
struct device_node *of_get_cpu_node(int cpu, unsigned int *thread)
{
return (int)phys_id == get_hard_smp_processor_id(cpu);
}


@@ -1633,12 +1633,11 @@ static void stage_topology_update(int core_id)
static int dt_update_callback(struct notifier_block *nb,
unsigned long action, void *data)
{
struct of_prop_reconfig *update;
struct of_reconfig_data *update = data;
int rc = NOTIFY_DONE;
switch (action) {
case OF_RECONFIG_UPDATE_PROPERTY:
update = (struct of_prop_reconfig *)data;
if (!of_prop_cmp(update->dn->type, "cpu") &&
!of_prop_cmp(update->prop->name, "ibm,associativity")) {
u32 core_id;


@@ -11,7 +11,6 @@
*/
#include <linux/kernel.h>
#include <linux/kref.h>
#include <linux/notifier.h>
#include <linux/spinlock.h>
#include <linux/cpu.h>
@@ -83,6 +82,8 @@ static struct device_node *dlpar_parse_cc_node(struct cc_workarea *ccwa)
return NULL;
}
of_node_set_flag(dn, OF_DYNAMIC);
return dn;
}


@@ -336,16 +336,17 @@ static void pseries_remove_processor(struct device_node *np)
}
static int pseries_smp_notifier(struct notifier_block *nb,
unsigned long action, void *node)
unsigned long action, void *data)
{
struct of_reconfig_data *rd = data;
int err = 0;
switch (action) {
case OF_RECONFIG_ATTACH_NODE:
err = pseries_add_processor(node);
err = pseries_add_processor(rd->dn);
break;
case OF_RECONFIG_DETACH_NODE:
pseries_remove_processor(node);
pseries_remove_processor(rd->dn);
break;
}
return notifier_from_errno(err);


@@ -198,7 +198,7 @@ static int pseries_add_memory(struct device_node *np)
return (ret < 0) ? -EINVAL : 0;
}
static int pseries_update_drconf_memory(struct of_prop_reconfig *pr)
static int pseries_update_drconf_memory(struct of_reconfig_data *pr)
{
struct of_drconf_cell *new_drmem, *old_drmem;
unsigned long memblock_size;
@@ -210,7 +210,7 @@ static int pseries_update_drconf_memory(struct of_prop_reconfig *pr)
if (!memblock_size)
return -EINVAL;
p = (u32 *)of_get_property(pr->dn, "ibm,dynamic-memory", NULL);
p = (u32 *) pr->old_prop->value;
if (!p)
return -EINVAL;
@@ -245,9 +245,9 @@ static int pseries_update_drconf_memory(struct of_prop_reconfig *pr)
}
static int pseries_memory_notifier(struct notifier_block *nb,
unsigned long action, void *node)
unsigned long action, void *data)
{
struct of_prop_reconfig *pr;
struct of_reconfig_data *rd = data;
int err = 0;
switch (action) {
@@ -258,9 +258,8 @@ static int pseries_memory_notifier(struct notifier_block *nb,
err = pseries_remove_memory(node);
break;
case OF_RECONFIG_UPDATE_PROPERTY:
pr = (struct of_prop_reconfig *)node;
if (!strcmp(pr->prop->name, "ibm,dynamic-memory"))
err = pseries_update_drconf_memory(pr);
if (!strcmp(rd->prop->name, "ibm,dynamic-memory"))
err = pseries_update_drconf_memory(rd);
break;
}
return notifier_from_errno(err);


@@ -1325,10 +1325,11 @@ static struct notifier_block iommu_mem_nb = {
.notifier_call = iommu_mem_notifier,
};
static int iommu_reconfig_notifier(struct notifier_block *nb, unsigned long action, void *node)
static int iommu_reconfig_notifier(struct notifier_block *nb, unsigned long action, void *data)
{
int err = NOTIFY_OK;
struct device_node *np = node;
struct of_reconfig_data *rd = data;
struct device_node *np = rd->dn;
struct pci_dn *pci = PCI_DN(np);
struct direct_window *window;


@@ -12,7 +12,6 @@
*/
#include <linux/kernel.h>
#include <linux/kref.h>
#include <linux/notifier.h>
#include <linux/proc_fs.h>
#include <linux/slab.h>
@@ -70,7 +69,6 @@ static int pSeries_reconfig_add_node(const char *path, struct property *proplist
np->properties = proplist;
of_node_set_flag(np, OF_DYNAMIC);
kref_init(&np->kref);
np->parent = derive_parent(path);
if (IS_ERR(np->parent)) {


@@ -253,9 +253,10 @@ static void __init pseries_discover_pic(void)
" interrupt-controller\n");
}
static int pci_dn_reconfig_notifier(struct notifier_block *nb, unsigned long action, void *node)
static int pci_dn_reconfig_notifier(struct notifier_block *nb, unsigned long action, void *data)
{
struct device_node *np = node;
struct of_reconfig_data *rd = data;
struct device_node *np = rd->dn;
struct pci_dn *pci = NULL;
int err = NOTIFY_OK;


@@ -202,7 +202,7 @@ void __init test_of_node(void)
/* There should really be a struct device_node allocator */
memset(&of_node, 0, sizeof(of_node));
kref_init(&of_node.kref);
kref_init(&of_node.kobj.kref);
of_node.full_name = node_name;
check(0 == msi_bitmap_alloc(&bmp, size, &of_node));

@@ -288,6 +288,7 @@ static inline void disable_surveillance(void)
args.token = rtas_token("set-indicator");
if (args.token == RTAS_UNKNOWN_SERVICE)
return;
args.token = cpu_to_be32(args.token);
args.nargs = cpu_to_be32(3);
args.nret = cpu_to_be32(1);
args.rets = &args.args[3];

@@ -970,7 +970,7 @@ static void __exit aes_s390_fini(void)
module_init(aes_s390_init);
module_exit(aes_s390_fini);
MODULE_ALIAS("aes-all");
MODULE_ALIAS_CRYPTO("aes-all");
MODULE_DESCRIPTION("Rijndael (AES) Cipher Algorithm");
MODULE_LICENSE("GPL");
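
Reviewer note: the `MODULE_ALIAS` to `MODULE_ALIAS_CRYPTO` conversions in this and the following files all follow one pattern. The sketch below assumes the macro simply prepends `crypto-` to the alias, which matches my understanding of the upstream definition but is not shown in this diff. `MODULE_ALIAS()` is replaced by a userspace stand-in that yields the string, and `is_crypto_alias()` is a hypothetical helper.

```c
#include <assert.h>
#include <string.h>

/*
 * Userspace stand-in: the real MODULE_ALIAS() emits a .modinfo entry.
 * Here it just yields the alias string so the prefixing is visible.
 */
#define MODULE_ALIAS(name)		(name)

/* Assumed shape: adjacent string literals concatenate at compile time. */
#define MODULE_ALIAS_CRYPTO(name)	MODULE_ALIAS("crypto-" name)

/* Returns 1 if an alias lives in the "crypto-" namespace, else 0. */
static int is_crypto_alias(const char *alias)
{
	return strncmp(alias, "crypto-", strlen("crypto-")) == 0;
}

/* Small self-check of the prefixed and unprefixed forms. */
void demo_alias(void)
{
	assert(strcmp(MODULE_ALIAS_CRYPTO("aes"), "crypto-aes") == 0);
	assert(is_crypto_alias(MODULE_ALIAS_CRYPTO("sha1")));
	assert(!is_crypto_alias(MODULE_ALIAS("sha1")));
}
```

With every algorithm alias in its own namespace, a crypto API lookup that requests `crypto-<name>` can only autoload modules that opted in through `MODULE_ALIAS_CRYPTO`, and not an arbitrary module that happens to share the name.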

@@ -619,8 +619,8 @@ static void __exit des_s390_exit(void)
module_init(des_s390_init);
module_exit(des_s390_exit);
MODULE_ALIAS("des");
MODULE_ALIAS("des3_ede");
MODULE_ALIAS_CRYPTO("des");
MODULE_ALIAS_CRYPTO("des3_ede");
MODULE_LICENSE("GPL");
MODULE_DESCRIPTION("DES & Triple DES EDE Cipher Algorithms");

@@ -160,7 +160,7 @@ static void __exit ghash_mod_exit(void)
module_init(ghash_mod_init);
module_exit(ghash_mod_exit);
MODULE_ALIAS("ghash");
MODULE_ALIAS_CRYPTO("ghash");
MODULE_LICENSE("GPL");
MODULE_DESCRIPTION("GHASH Message Digest Algorithm, s390 implementation");

@@ -103,6 +103,6 @@ static void __exit sha1_s390_fini(void)
module_init(sha1_s390_init);
module_exit(sha1_s390_fini);
MODULE_ALIAS("sha1");
MODULE_ALIAS_CRYPTO("sha1");
MODULE_LICENSE("GPL");
MODULE_DESCRIPTION("SHA1 Secure Hash Algorithm");

@@ -143,7 +143,7 @@ static void __exit sha256_s390_fini(void)
module_init(sha256_s390_init);
module_exit(sha256_s390_fini);
MODULE_ALIAS("sha256");
MODULE_ALIAS("sha224");
MODULE_ALIAS_CRYPTO("sha256");
MODULE_ALIAS_CRYPTO("sha224");
MODULE_LICENSE("GPL");
MODULE_DESCRIPTION("SHA256 and SHA224 Secure Hash Algorithm");

@@ -86,7 +86,7 @@ static struct shash_alg sha512_alg = {
}
};
MODULE_ALIAS("sha512");
MODULE_ALIAS_CRYPTO("sha512");
static int sha384_init(struct shash_desc *desc)
{
@@ -126,7 +126,7 @@ static struct shash_alg sha384_alg = {
}
};
MODULE_ALIAS("sha384");
MODULE_ALIAS_CRYPTO("sha384");
static int __init init(void)
{

@@ -499,6 +499,6 @@ module_exit(aes_sparc64_mod_fini);
MODULE_LICENSE("GPL");
MODULE_DESCRIPTION("AES Secure Hash Algorithm, sparc64 aes opcode accelerated");
MODULE_ALIAS("aes");
MODULE_ALIAS_CRYPTO("aes");
#include "crop_devid.c"

@@ -322,6 +322,6 @@ module_exit(camellia_sparc64_mod_fini);
MODULE_LICENSE("GPL");
MODULE_DESCRIPTION("Camellia Cipher Algorithm, sparc64 camellia opcode accelerated");
MODULE_ALIAS("aes");
MODULE_ALIAS_CRYPTO("aes");
#include "crop_devid.c"

@@ -176,6 +176,6 @@ module_exit(crc32c_sparc64_mod_fini);
MODULE_LICENSE("GPL");
MODULE_DESCRIPTION("CRC32c (Castagnoli), sparc64 crc32c opcode accelerated");
MODULE_ALIAS("crc32c");
MODULE_ALIAS_CRYPTO("crc32c");
#include "crop_devid.c"

@@ -532,6 +532,6 @@ module_exit(des_sparc64_mod_fini);
MODULE_LICENSE("GPL");
MODULE_DESCRIPTION("DES & Triple DES EDE Cipher Algorithms, sparc64 des opcode accelerated");
MODULE_ALIAS("des");
MODULE_ALIAS_CRYPTO("des");
#include "crop_devid.c"

@@ -185,6 +185,6 @@ module_exit(md5_sparc64_mod_fini);
MODULE_LICENSE("GPL");
MODULE_DESCRIPTION("MD5 Secure Hash Algorithm, sparc64 md5 opcode accelerated");
MODULE_ALIAS("md5");
MODULE_ALIAS_CRYPTO("md5");
#include "crop_devid.c"

@@ -180,6 +180,6 @@ module_exit(sha1_sparc64_mod_fini);
MODULE_LICENSE("GPL");
MODULE_DESCRIPTION("SHA1 Secure Hash Algorithm, sparc64 sha1 opcode accelerated");
MODULE_ALIAS("sha1");
MODULE_ALIAS_CRYPTO("sha1");
#include "crop_devid.c"

@@ -237,7 +237,7 @@ module_exit(sha256_sparc64_mod_fini);
MODULE_LICENSE("GPL");
MODULE_DESCRIPTION("SHA-224 and SHA-256 Secure Hash Algorithm, sparc64 sha256 opcode accelerated");
MODULE_ALIAS("sha224");
MODULE_ALIAS("sha256");
MODULE_ALIAS_CRYPTO("sha224");
MODULE_ALIAS_CRYPTO("sha256");
#include "crop_devid.c"

@@ -222,7 +222,7 @@ module_exit(sha512_sparc64_mod_fini);
MODULE_LICENSE("GPL");
MODULE_DESCRIPTION("SHA-384 and SHA-512 Secure Hash Algorithm, sparc64 sha512 opcode accelerated");
MODULE_ALIAS("sha384");
MODULE_ALIAS("sha512");
MODULE_ALIAS_CRYPTO("sha384");
MODULE_ALIAS_CRYPTO("sha512");
#include "crop_devid.c"

@@ -8,6 +8,7 @@ config UML
default y
select HAVE_GENERIC_HARDIRQS
select HAVE_UID16
select HAVE_FUTEX_CMPXCHG if FUTEX
select GENERIC_IRQ_SHOW
select GENERIC_CPU_DEVICES
select GENERIC_IO

@@ -66,5 +66,5 @@ module_exit(aes_fini);
MODULE_DESCRIPTION("Rijndael (AES) Cipher Algorithm, asm optimized");
MODULE_LICENSE("GPL");
MODULE_ALIAS("aes");
MODULE_ALIAS("aes-asm");
MODULE_ALIAS_CRYPTO("aes");
MODULE_ALIAS_CRYPTO("aes-asm");

@@ -1373,4 +1373,4 @@ module_exit(aesni_exit);
MODULE_DESCRIPTION("Rijndael (AES) Cipher Algorithm, Intel AES-NI instructions optimized");
MODULE_LICENSE("GPL");
MODULE_ALIAS("aes");
MODULE_ALIAS_CRYPTO("aes");

@@ -581,5 +581,5 @@ module_exit(fini);
MODULE_LICENSE("GPL");
MODULE_DESCRIPTION("Blowfish Cipher Algorithm, AVX2 optimized");
MODULE_ALIAS("blowfish");
MODULE_ALIAS("blowfish-asm");
MODULE_ALIAS_CRYPTO("blowfish");
MODULE_ALIAS_CRYPTO("blowfish-asm");

@@ -465,5 +465,5 @@ module_exit(fini);
MODULE_LICENSE("GPL");
MODULE_DESCRIPTION("Blowfish Cipher Algorithm, asm optimized");
MODULE_ALIAS("blowfish");
MODULE_ALIAS("blowfish-asm");
MODULE_ALIAS_CRYPTO("blowfish");
MODULE_ALIAS_CRYPTO("blowfish-asm");

@@ -582,5 +582,5 @@ module_exit(camellia_aesni_fini);
MODULE_LICENSE("GPL");
MODULE_DESCRIPTION("Camellia Cipher Algorithm, AES-NI/AVX2 optimized");
MODULE_ALIAS("camellia");
MODULE_ALIAS("camellia-asm");
MODULE_ALIAS_CRYPTO("camellia");
MODULE_ALIAS_CRYPTO("camellia-asm");

@@ -574,5 +574,5 @@ module_exit(camellia_aesni_fini);
MODULE_LICENSE("GPL");
MODULE_DESCRIPTION("Camellia Cipher Algorithm, AES-NI/AVX optimized");
MODULE_ALIAS("camellia");
MODULE_ALIAS("camellia-asm");
MODULE_ALIAS_CRYPTO("camellia");
MODULE_ALIAS_CRYPTO("camellia-asm");

@@ -1725,5 +1725,5 @@ module_exit(fini);
MODULE_LICENSE("GPL");
MODULE_DESCRIPTION("Camellia Cipher Algorithm, asm optimized");
MODULE_ALIAS("camellia");
MODULE_ALIAS("camellia-asm");
MODULE_ALIAS_CRYPTO("camellia");
MODULE_ALIAS_CRYPTO("camellia-asm");

@@ -494,4 +494,4 @@ module_exit(cast5_exit);
MODULE_DESCRIPTION("Cast5 Cipher Algorithm, AVX optimized");
MODULE_LICENSE("GPL");
MODULE_ALIAS("cast5");
MODULE_ALIAS_CRYPTO("cast5");

@@ -611,4 +611,4 @@ module_exit(cast6_exit);
MODULE_DESCRIPTION("Cast6 Cipher Algorithm, AVX optimized");
MODULE_LICENSE("GPL");
MODULE_ALIAS("cast6");
MODULE_ALIAS_CRYPTO("cast6");

@@ -197,5 +197,5 @@ module_exit(crc32_pclmul_mod_fini);
MODULE_AUTHOR("Alexander Boyko <alexander_boyko@xyratex.com>");
MODULE_LICENSE("GPL");
MODULE_ALIAS("crc32");
MODULE_ALIAS("crc32-pclmul");
MODULE_ALIAS_CRYPTO("crc32");
MODULE_ALIAS_CRYPTO("crc32-pclmul");

@@ -280,5 +280,5 @@ MODULE_AUTHOR("Austin Zhang <austin.zhang@intel.com>, Kent Liu <kent.liu@intel.c
MODULE_DESCRIPTION("CRC32c (Castagnoli) optimization using Intel Hardware.");
MODULE_LICENSE("GPL");
MODULE_ALIAS("crc32c");
MODULE_ALIAS("crc32c-intel");
MODULE_ALIAS_CRYPTO("crc32c");
MODULE_ALIAS_CRYPTO("crc32c-intel");

@@ -17,6 +17,7 @@
#include <linux/kernel.h>
#include <linux/module.h>
#include <linux/slab.h>
#include <linux/crypto.h>
#include <asm/i387.h>
struct crypto_fpu_ctx {
@@ -159,3 +160,5 @@ void __exit crypto_fpu_exit(void)
{
crypto_unregister_template(&crypto_fpu_tmpl);
}
MODULE_ALIAS_CRYPTO("fpu");

@@ -341,4 +341,4 @@ module_exit(ghash_pclmulqdqni_mod_exit);
MODULE_LICENSE("GPL");
MODULE_DESCRIPTION("GHASH Message Digest Algorithm, "
"acclerated by PCLMULQDQ-NI");
MODULE_ALIAS("ghash");
MODULE_ALIAS_CRYPTO("ghash");

@@ -119,5 +119,5 @@ module_exit(fini);
MODULE_LICENSE("GPL");
MODULE_DESCRIPTION ("Salsa20 stream cipher algorithm (optimized assembly version)");
MODULE_ALIAS("salsa20");
MODULE_ALIAS("salsa20-asm");
MODULE_ALIAS_CRYPTO("salsa20");
MODULE_ALIAS_CRYPTO("salsa20-asm");

@@ -558,5 +558,5 @@ module_exit(fini);
MODULE_LICENSE("GPL");
MODULE_DESCRIPTION("Serpent Cipher Algorithm, AVX2 optimized");
MODULE_ALIAS("serpent");
MODULE_ALIAS("serpent-asm");
MODULE_ALIAS_CRYPTO("serpent");
MODULE_ALIAS_CRYPTO("serpent-asm");

@@ -617,4 +617,4 @@ module_exit(serpent_exit);
MODULE_DESCRIPTION("Serpent Cipher Algorithm, AVX optimized");
MODULE_LICENSE("GPL");
MODULE_ALIAS("serpent");
MODULE_ALIAS_CRYPTO("serpent");

@@ -618,4 +618,4 @@ module_exit(serpent_sse2_exit);
MODULE_DESCRIPTION("Serpent Cipher Algorithm, SSE2 optimized");
MODULE_LICENSE("GPL");
MODULE_ALIAS("serpent");
MODULE_ALIAS_CRYPTO("serpent");

@@ -237,4 +237,4 @@ module_exit(sha1_ssse3_mod_fini);
MODULE_LICENSE("GPL");
MODULE_DESCRIPTION("SHA1 Secure Hash Algorithm, Supplemental SSE3 accelerated");
MODULE_ALIAS("sha1");
MODULE_ALIAS_CRYPTO("sha1");

@@ -272,4 +272,4 @@ module_exit(sha256_ssse3_mod_fini);
MODULE_LICENSE("GPL");
MODULE_DESCRIPTION("SHA256 Secure Hash Algorithm, Supplemental SSE3 accelerated");
MODULE_ALIAS("sha256");
MODULE_ALIAS_CRYPTO("sha256");

@@ -279,4 +279,4 @@ module_exit(sha512_ssse3_mod_fini);
MODULE_LICENSE("GPL");
MODULE_DESCRIPTION("SHA512 Secure Hash Algorithm, Supplemental SSE3 accelerated");
MODULE_ALIAS("sha512");
MODULE_ALIAS_CRYPTO("sha512");

@@ -580,5 +580,5 @@ module_exit(fini);
MODULE_LICENSE("GPL");
MODULE_DESCRIPTION("Twofish Cipher Algorithm, AVX2 optimized");
MODULE_ALIAS("twofish");
MODULE_ALIAS("twofish-asm");
MODULE_ALIAS_CRYPTO("twofish");
MODULE_ALIAS_CRYPTO("twofish-asm");

@@ -589,4 +589,4 @@ module_exit(twofish_exit);
MODULE_DESCRIPTION("Twofish Cipher Algorithm, AVX optimized");
MODULE_LICENSE("GPL");
MODULE_ALIAS("twofish");
MODULE_ALIAS_CRYPTO("twofish");

@@ -96,5 +96,5 @@ module_exit(fini);
MODULE_LICENSE("GPL");
MODULE_DESCRIPTION ("Twofish Cipher Algorithm, asm optimized");
MODULE_ALIAS("twofish");
MODULE_ALIAS("twofish-asm");
MODULE_ALIAS_CRYPTO("twofish");
MODULE_ALIAS_CRYPTO("twofish-asm");

@@ -495,5 +495,5 @@ module_exit(fini);
MODULE_LICENSE("GPL");
MODULE_DESCRIPTION("Twofish Cipher Algorithm, 3-way parallel asm optimized");
MODULE_ALIAS("twofish");
MODULE_ALIAS("twofish-asm");
MODULE_ALIAS_CRYPTO("twofish");
MODULE_ALIAS_CRYPTO("twofish-asm");

@@ -251,7 +251,8 @@ static inline void native_load_tls(struct thread_struct *t, unsigned int cpu)
gdt[GDT_ENTRY_TLS_MIN + i] = t->tls_array[i];
}
#define _LDT_empty(info) \
/* This intentionally ignores lm, since 32-bit apps don't have that field. */
#define LDT_empty(info) \
((info)->base_addr == 0 && \
(info)->limit == 0 && \
(info)->contents == 0 && \
@@ -261,11 +262,18 @@ static inline void native_load_tls(struct thread_struct *t, unsigned int cpu)
(info)->seg_not_present == 1 && \
(info)->useable == 0)
#ifdef CONFIG_X86_64
#define LDT_empty(info) (_LDT_empty(info) && ((info)->lm == 0))
#else
#define LDT_empty(info) (_LDT_empty(info))
#endif
/* Lots of programs expect an all-zero user_desc to mean "no segment at all". */
static inline bool LDT_zero(const struct user_desc *info)
{
return (info->base_addr == 0 &&
info->limit == 0 &&
info->contents == 0 &&
info->read_exec_only == 0 &&
info->seg_32bit == 0 &&
info->limit_in_pages == 0 &&
info->seg_not_present == 0 &&
info->useable == 0);
}
static inline void clear_LDT(void)
{

@@ -60,6 +60,7 @@ static struct clocksource hyperv_cs = {
.rating = 400, /* use this when running on Hyperv*/
.read = read_hv_clock,
.mask = CLOCKSOURCE_MASK(64),
.flags = CLOCK_SOURCE_IS_CONTINUOUS,
};
static void __init ms_hyperv_init_platform(void)

@@ -1017,6 +1017,15 @@ int __kprobes setjmp_pre_handler(struct kprobe *p, struct pt_regs *regs)
regs->flags &= ~X86_EFLAGS_IF;
trace_hardirqs_off();
regs->ip = (unsigned long)(jp->entry);
/*
* jprobes use jprobe_return() which skips the normal return
* path of the function, and this messes up the accounting of the
* function graph tracer to get messed up.
*
* Pause function graph tracing while performing the jprobe function.
*/
pause_graph_tracing();
return 1;
}
@@ -1042,24 +1051,25 @@ int __kprobes longjmp_break_handler(struct kprobe *p, struct pt_regs *regs)
struct kprobe_ctlblk *kcb = get_kprobe_ctlblk();
u8 *addr = (u8 *) (regs->ip - 1);
struct jprobe *jp = container_of(p, struct jprobe, kp);
void *saved_sp = kcb->jprobe_saved_sp;
if ((addr > (u8 *) jprobe_return) &&
(addr < (u8 *) jprobe_return_end)) {
if (stack_addr(regs) != kcb->jprobe_saved_sp) {
if (stack_addr(regs) != saved_sp) {
struct pt_regs *saved_regs = &kcb->jprobe_saved_regs;
printk(KERN_ERR
"current sp %p does not match saved sp %p\n",
stack_addr(regs), kcb->jprobe_saved_sp);
stack_addr(regs), saved_sp);
printk(KERN_ERR "Saved registers for jprobe %p\n", jp);
show_regs(saved_regs);
printk(KERN_ERR "Current registers\n");
show_regs(regs);
BUG();
}
/* It's OK to start function graph tracing again */
unpause_graph_tracing();
*regs = kcb->jprobe_saved_regs;
memcpy((kprobe_opcode_t *)(kcb->jprobe_saved_sp),
kcb->jprobes_stack,
MIN_STACK_SIZE(kcb->jprobe_saved_sp));
memcpy(saved_sp, kcb->jprobes_stack, MIN_STACK_SIZE(saved_sp));
preempt_enable_no_resched();
return 1;
}

@@ -29,7 +29,28 @@ static int get_free_idx(void)
static bool tls_desc_okay(const struct user_desc *info)
{
if (LDT_empty(info))
/*
* For historical reasons (i.e. no one ever documented how any
* of the segmentation APIs work), user programs can and do
* assume that a struct user_desc that's all zeros except for
* entry_number means "no segment at all". This never actually
* worked. In fact, up to Linux 3.19, a struct user_desc like
* this would create a 16-bit read-write segment with base and
* limit both equal to zero.
*
* That was close enough to "no segment at all" until we
* hardened this function to disallow 16-bit TLS segments. Fix
* it up by interpreting these zeroed segments the way that they
* were almost certainly intended to be interpreted.
*
* The correct way to ask for "no segment at all" is to specify
* a user_desc that satisfies LDT_empty. To keep everything
* working, we accept both.
*
* Note that there's a similar kludge in modify_ldt -- look at
* the distinction between modes 1 and 0x11.
*/
if (LDT_empty(info) || LDT_zero(info))
return true;
/*
@@ -71,7 +92,7 @@ static void set_tls_desc(struct task_struct *p, int idx,
cpu = get_cpu();
while (n-- > 0) {
if (LDT_empty(info))
if (LDT_empty(info) || LDT_zero(info))
desc->a = desc->b = 0;
else
fill_ldt(desc, info);
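
Reviewer note: the long comment in `tls_desc_okay()` describes two encodings that both mean "no segment at all". The sketch below shows them side by side. `struct user_desc_sketch` is a hypothetical mirror of the fields the checks read, and the three `LDT_empty` conditions that fall between the hunks (`read_exec_only`, `seg_32bit`, `limit_in_pages`) are assumed from upstream, since this diff does not show them.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical mirror of the struct user_desc fields that the checks read. */
struct user_desc_sketch {
	unsigned int entry_number;
	unsigned int base_addr;
	unsigned int limit;
	unsigned int seg_32bit:1;
	unsigned int contents:2;
	unsigned int read_exec_only:1;
	unsigned int limit_in_pages:1;
	unsigned int seg_not_present:1;
	unsigned int useable:1;
};

/* Canonical "no segment" encoding; middle three conditions are assumed. */
static bool ldt_empty(const struct user_desc_sketch *info)
{
	return info->base_addr == 0 && info->limit == 0 &&
	       info->contents == 0 && info->read_exec_only == 1 &&
	       info->seg_32bit == 0 && info->limit_in_pages == 0 &&
	       info->seg_not_present == 1 && info->useable == 0;
}

/* All-zero descriptor, entry_number ignored, as in LDT_zero() in the diff. */
static bool ldt_zero(const struct user_desc_sketch *info)
{
	return info->base_addr == 0 && info->limit == 0 &&
	       info->contents == 0 && info->read_exec_only == 0 &&
	       info->seg_32bit == 0 && info->limit_in_pages == 0 &&
	       info->seg_not_present == 0 && info->useable == 0;
}

/* Combined test applied by tls_desc_okay() and set_tls_desc() after the patch. */
static bool means_no_segment(const struct user_desc_sketch *info)
{
	return ldt_empty(info) || ldt_zero(info);
}

/* Small self-check covering both encodings and one real segment. */
void demo_ldt(void)
{
	struct user_desc_sketch zeroed = { .entry_number = 6 };
	struct user_desc_sketch empty = { .read_exec_only = 1, .seg_not_present = 1 };
	struct user_desc_sketch flat = { .limit = 0xfffff, .seg_32bit = 1,
					 .limit_in_pages = 1 };

	assert(means_no_segment(&zeroed));
	assert(means_no_segment(&empty));
	assert(!means_no_segment(&flat));
}
```

The two predicates are mutually exclusive, because they disagree on `read_exec_only` and `seg_not_present`, so the `||` in the patched code widens what is accepted without changing how a canonical empty descriptor is handled.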

@@ -362,7 +362,7 @@ exit:
* for scheduling or signal handling. The actual stack switch is done in
* entry.S
*/
asmlinkage __kprobes struct pt_regs *sync_regs(struct pt_regs *eregs)
asmlinkage notrace __kprobes struct pt_regs *sync_regs(struct pt_regs *eregs)
{
struct pt_regs *regs = eregs;
/* Did already sync */
@@ -387,7 +387,7 @@ struct bad_iret_stack {
struct pt_regs regs;
};
asmlinkage __visible
asmlinkage __visible notrace __kprobes
struct bad_iret_stack *fixup_bad_iret(struct bad_iret_stack *s)
{
/*

@@ -380,7 +380,7 @@ static unsigned long quick_pit_calibrate(void)
goto success;
}
}
pr_err("Fast TSC calibration failed\n");
pr_info("Fast TSC calibration failed\n");
return 0;
success:

@@ -34,7 +34,7 @@ typedef asmlinkage void (*sys_call_ptr_t)(void);
extern asmlinkage void sys_ni_syscall(void);
const sys_call_ptr_t sys_call_table[] __cacheline_aligned = {
const sys_call_ptr_t sys_call_table[] ____cacheline_aligned = {
/*
* Smells like a compiler bug -- it doesn't work
* when the & below is removed.

@@ -46,7 +46,7 @@ typedef void (*sys_call_ptr_t)(void);
extern void sys_ni_syscall(void);
const sys_call_ptr_t sys_call_table[] __cacheline_aligned = {
const sys_call_ptr_t sys_call_table[] ____cacheline_aligned = {
/*
* Smells like a compiler bug -- it doesn't work
* when the & below is removed.
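
Reviewer note: both UML syscall-table hunks swap `__cacheline_aligned` for `____cacheline_aligned`. As far as I recall from the kernel's cache header, the two-underscore form also forces the object into a dedicated data section, while the four-underscore form only sets alignment. That matters for a `const` table that is meant to end up in read-only data. The sketch below shows the alignment-only form using the GCC/Clang `aligned` attribute. The 64-byte line size and all `*_sketch` names are assumptions for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_CACHE_BYTES 64	/* assumed cache line size, illustration only */

/* Alignment only, in the spirit of ____cacheline_aligned: no section override. */
#define sketch_cacheline_aligned \
	__attribute__((__aligned__(SKETCH_CACHE_BYTES)))

typedef void (*sys_call_ptr_t)(void);

/* Placeholder handler standing in for sys_ni_syscall(). */
static void sys_ni_syscall_sketch(void)
{
}

/* const plus an alignment-only attribute leaves section placement to the toolchain. */
static const sys_call_ptr_t table_sketch[] sketch_cacheline_aligned = {
	sys_ni_syscall_sketch,
	sys_ni_syscall_sketch,
};

/* Small self-check that the table starts on the requested boundary. */
void demo_aligned(void)
{
	assert((uintptr_t)table_sketch % SKETCH_CACHE_BYTES == 0);
	assert(table_sketch[0] == sys_ni_syscall_sketch);
}
```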

@@ -180,3 +180,4 @@ module_exit(nx842_mod_exit);
MODULE_LICENSE("GPL");
MODULE_DESCRIPTION("842 Compression Algorithm");
MODULE_ALIAS_CRYPTO("842");
