Commit Graph

275188 Commits

Author SHA1 Message Date
Dan Carpenter
edb2b255a0 USB: message: cleanup min_t() cast in usb_sg_init()
"length" is type size_t so the cast to unsigned int truncates the
upper bytes.  This isn't an issue in real life (I've checked the
callers) but it's a bit messy.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-09-29 13:13:07 -07:00
Rigbert Hamisch
1bfac90d1b USB: qcserial: add device ID for "HP un2430 Mobile Broadband Module"
add device ID for "HP un2430 Mobile Broadband Module"

Signed-off-by: Rigbert Hamisch <rigbert@gmx.de>
Cc: stable <stable@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-09-29 13:12:36 -07:00
Jiri Olsa
87ffef79ab perf symbols: Treat all memory maps without dso file as loaded
The stack/vdso/heap memory maps dont have any dso file.  Setting the
perf dso objects as 'loaded' for these maps, we avoid unnecessary
warnings like:

  "Failed to open [stack], continuing without symbols"

All map__find_* functions still return NULL when searching for symbols
in these maps.

Link: http://lkml.kernel.org/r/20110824131834.GA2007@jolsa.brq.redhat.com
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-09-29 17:10:48 -03:00
Jiri Olsa
e78cb3628b perf sched: Fix script command documentation
Fixed leftover from trace -> script rename.

Link: http://lkml.kernel.org/r/1317114995-4534-1-git-send-email-jolsa@redhat.com
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-09-29 17:10:34 -03:00
Arnaldo Carvalho de Melo
b53af0f235 perf report: Fix stdio event name header printing
In the past we tried to avoid printing the name of the event when just
one event was found in the perf.data file, after some refactorings it
ended up not printing the event name if just one hist_entry was found in
one of the events.

Fix it by always printing the name of the event, even if just one is
found.

Reported-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/n/tip-kikr0c7ou55bd9caok8569rf@git.kernel.org
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-09-29 17:10:19 -03:00
Andi Kleen
f69b64f73e perf: Support setting the disassembler style
Add -M option to report/annotate to pass directly to objdump.  This
allows to use -M intel for intel style disassembler syntax, which is
useful for people who are very used to the Intel syntax.

Link: http://lkml.kernel.org/r/1316122302-24306-2-git-send-email-andi@firstfloor.org
[committer note: Add missing Documentation bits, fixup conflicts with 3e6a2a7]
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Stephane Eranian <eranian@google.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-09-29 17:10:00 -03:00
Andi Kleen
33e49ea70d perf tools: Make stat/record print fatal signals of the target program
When a program crashes under perf there is no message about it, unlike
when running it from bash. This can be confusing and lead to wrong
actions during debugging.

Print fatal signals in perf stat/record.

Thanks to Furat Afram for finding the problem originally

Link: http://lkml.kernel.org/r/1316122302-24306-1-git-send-email-andi@firstfloor.org
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Stephane Eranian <eranian@google.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-09-29 17:09:46 -03:00
Jim Cromie
61a9f32429 perf stat: Fix spelling in comment
Link: http://lkml.kernel.org/r/1315437244-3788-6-git-send-email-jim.cromie@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Jim Cromie <jim.cromie@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-09-29 17:09:36 -03:00
Ming Lei
2a5306cc5f PM / Tracing: build rpm-traces.c only if CONFIG_PM_RUNTIME is set
Do not build kernel/trace/rpm-traces.c if CONFIG_PM_RUNTIME is not
set, which avoids a build failure.

[rjw: Added the changelog and modified the subject slightly.]

Signed-off-by: Ming Lei <ming.lei@canonical.com>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
2011-09-29 22:07:23 +02:00
Jim Cromie
d4ffd04df1 perf stat: Allow tab as cvs delimiter
If option -x '\t' is given, convert '\t' to "\t".  This makes cvs
printing more flexible.

Link: http://lkml.kernel.org/r/1315437244-3788-5-git-send-email-jim.cromie@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Jim Cromie <jim.cromie@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-09-29 17:04:30 -03:00
Jim Cromie
a1bca6cc87 perf stat: Suppress printing std-dev when its 0
For pretty output only (preserve column for cvs output), dont print
std-deviation when its 0.00.  Do this based upon value, instead of
checking for --no-aggr, since the stats could conceivably be computed
over the runs on each CPU, and theres no reason to preclude that.

Link: http://lkml.kernel.org/r/1315437244-3788-4-git-send-email-jim.cromie@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Jim Cromie <jim.cromie@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-09-29 17:04:00 -03:00
Jim Cromie
19f4740255 perf stat: Fix +- nan% in --no-aggr runs
Without this patch, running:

$ sudo ./perf stat -r20 --no-aggr -a perl -e '$i++ for 1..100000'

I get computations like this:

CPU0             12.488247 task-clock                #    1.224 CPUs utilized            ( +-  -nan% )
CPU1             12.488909 task-clock                #    1.225 CPUs utilized            ( +-  -nan% )
CPU2             12.500221 task-clock                #    1.226 CPUs utilized            ( +-  -nan% )
CPU3             12.481713 task-clock                #    1.224 CPUs utilized            ( +-  -nan% )

but with patch, I get:

CPU0              8.233682 task-clock                #    0.754 CPUs utilized            ( +-  0.00% )
CPU1              8.226318 task-clock                #    0.754 CPUs utilized            ( +-  0.00% )
CPU2              8.210737 task-clock                #    0.752 CPUs utilized            ( +-  0.00% )
CPU3              8.201691 task-clock                #    0.751 CPUs utilized            ( +-  0.00% )

Note that without --no-aggr, I get non-0 statistics both before and after patch:

        231.986022 task-clock                #    4.030 CPUs utilized            ( +-  0.97% )
               212 context-switches          #    0.001 M/sec                    ( +- 12.07% )
                 9 CPU-migrations            #    0.000 M/sec                    ( +- 25.80% )
               466 page-faults               #    0.002 M/sec                    ( +-  3.23% )
       174,318,593 cycles                    #    0.751 GHz                      ( +-  1.06% )

I couldnt see anything wrong in the caller, so fixed it in
stddev_stats().  ISTM that 0.00 is better than nan, since perf stat was
passed -A (--no-aggr) so no standard deviation should be expected, and
nan is suggestive of a deeper error.

When running with --no-aggr, perhaps we should suppress the statistics
printing entirely, or do so when they are 0.00.

Link: http://lkml.kernel.org/r/1315437244-3788-3-git-send-email-jim.cromie@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Jim Cromie <jim.cromie@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-09-29 17:03:46 -03:00
Jim Cromie
56f3bae706 perf stat: Add --log-fd <N> option to redirect stderr elsewhere
This perf stat option emulates valgrind's --log-fd option, allowing the
user to send perf results elsewhere, and leaving stderr for use by the
program under test.  This complements --output file option, and is
mutually exclusive with it.

   3>results  perf stat --log-fd 3          -- $cmd
   3>>results perf stat --log-fd 3 --append -- $cmd

The perl distro's make test.valgrind target uses valgrind's --log-fd
option, I've adapted it to invoke perf also, and tested this patch
there.

Link: http://lkml.kernel.org/r/1315437244-3788-2-git-send-email-jim.cromie@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Jim Cromie <jim.cromie@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-09-29 17:03:23 -03:00
Arnaldo Carvalho de Melo
dcc101d1d0 perf top: Improve lost events warning
Now it warns everytime that new events are lost.

And the TUI also warns now.

Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-w1n168yrvrppnq6887s4u0wx@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-09-29 16:41:38 -03:00
Arnaldo Carvalho de Melo
eb48900831 perf top browser: Fix up line width calculation
Fixing an artifact where the last 3 chars of a long DSO name would
remain on the screen sometimes.

Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-dkiakcl3z69dh1bt9uegaktv@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-09-29 16:41:37 -03:00
Arnaldo Carvalho de Melo
7a6f205d62 perf buildid-list: Support showing the build id in an ELF file
Try first reading the build id, validating that it is an ELF file, etc.
Cheap as libelf will bail out as soon as the magic number check fails.

Useful when investigating debuginfo packaging problems like this one:

[root@emilia ~]# perf buildid-list -i /usr/lib/debug/lib/modules/`uname -r`/vmlinux
77bb4ea591a602d455ace759a377c9adfff1aba3
[root@emilia ~]# perf buildid-list -k
07b0c016a2b30004e86132d0239945b1e88f5d75
[root@emilia ~]#

Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-4elot9oxwa0rr0d90dshca3a@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-09-29 16:41:37 -03:00
Arnaldo Carvalho de Melo
f2add9cd66 perf buildid-list: Add option to show the running kernel build id
[root@emilia ~]# perf buildid-list -k
07b0c016a2b30004e86132d0239945b1e88f5d75

Useful when diagnosing build id problems in debuginfo packages, etc.

Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-po1bl7acn6e1hhne90opmvtl@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-09-29 16:41:37 -03:00
Neil Horman
63e03724b5 perf script: Add drop monitor script
A while back I created the dropmonitor protocol, which allowed users to get
reports of dropped frames communicated to them via a netlink socket.

While useful, several people have now asked that I integrate the ability
to do drop monitoring with perf, so they don't have to run additional
tools.

This patch adds a drop monitor script to the perf suite, and provides
the same output that the netlink socket does.

Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1309801217-22450-1-git-send-email-nhorman@tuxdriver.com
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-09-29 16:41:37 -03:00
Arnaldo Carvalho de Melo
98dfd55d80 perf symbols: Stop using 'self' in map_groups__ methods
Stop using this python/OOP convention, doesn't really helps. Will do
more from time to time till we get it cleaned up in all of /perf.

Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-rl9e690y60vnuyng05yp1zd3@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-09-29 16:41:36 -03:00
David S. Miller
3869f80605 Merge git://github.com/Jkirsher/net-next 2011-09-29 15:41:20 -04:00
Oliver Hartkopp
12d0d0d3a7 can bcm: fix incomplete tx_setup fix
The commit aabdcb0b55 ("can bcm: fix tx_setup
off-by-one errors") fixed only a part of the original problem reported by
Andre Naujoks. It turned out that the original code needed to be re-ordered
to reduce complexity and to finally fix the reported frame counting issues.

Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-09-29 15:33:47 -04:00
Oliver Neukum
c510eae377 btusb: add device entry for Broadcom SoftSailing
From 0cea73465cd22373c5cd43a3edd25fbd4bb532ef Mon Sep 17 00:00:00 2001
From: Oliver Neukum <oliver@neukum.org>
Date: Wed, 21 Sep 2011 11:37:15 +0200
Subject: [PATCH] btusb: add device entry for Broadcom SoftSailing

This device declares itself to be vendor specific
It therefore needs to be added to the device table
to make btusb bind.

Signed-off-by: Oliver Neukum <oneukum@suse.de>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2011-09-29 16:31:15 -03:00
Jiri Olsa
8e303f20f4 perf tools: Fix raw sample reading
Wrong pointer is being passed for raw data sanity checking, when parsing
sample event.

This ends up with invalid event and perf record being stuck in
__perf_session__process_events function during processing build IDs
(process_buildids function).

Following command hangs up in my setup:
	./perf record -e raw_syscalls:sys_enter ls

The fix is to use proper pointer to the raw data instead of the 'u'
union.

Reviewed-by: David Ahern <dsahern@gmail.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Neil Horman <nhorman@tuxdriver.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Steven Rostedt <rostedt@goodmis.org>
Link: http://lkml.kernel.org/r/1317308709-9474-2-git-send-email-jolsa@redhat.com
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-09-29 16:29:53 -03:00
David S. Miller
f4142cba4e sparc64: Force the execute bit in OpenFirmware's translation entries.
In the OF 'translations' property, the template TTEs in the mappings
never specify the executable bit.  This is the case even though some
of these mappings are for OF's code segment.

Therefore, we need to force the execute bit on in every mapping.

This problem can only really trigger on Niagara/sun4v machines and the
history behind this is a little complicated.

Previous to sun4v, the sun4u TTE entries lacked a hardware execute
permission bit.  So OF didn't have to ever worry about setting
anything to handle executable pages.  Any valid TTE loaded into the
I-TLB would be respected by the chip.

But sun4v Niagara chips have a real hardware enforced executable bit
in their TTEs.  So it has to be set or else the I-TLB throws an
instruction access exception with type code 6 (protection violation).

We've been extremely fortunate to not get bitten by this in the past.

The best I can tell is that the OF's mappings for it's executable code
were mapped using permanent locked mappings on sun4v in the past.
Therefore, the fact that we didn't have the exec bit set in the OF
translations we would use did not matter in practice.

Thanks to Greg Onufer for helping me track this down.

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-09-29 12:18:59 -07:00
David Vrabel
4dcaebbf65 xen: use generic functions instead of xen_{alloc, free}_vm_area()
Replace calls to the Xen-specific xen_alloc_vm_area() and
xen_free_vm_area() functions with the generic equivalent
(alloc_vm_area() and free_vm_area()).

On x86, these were identical already.

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-09-29 15:02:18 -04:00
Jonathan Lallinger
85a6488949 RDSRDMA: Fix cleanup of rds_iw_mr_pool
In the rds_iw_mr_pool struct the free_pinned field keeps track of
memory pinned by free MRs. While this field is incremented properly
upon allocation, it is never decremented upon unmapping. This would
cause the rds_rdma module to crash the kernel upon unloading, by
triggering the BUG_ON in the rds_iw_destroy_mr_pool function.

This change keeps track of the MRs that become unpinned, so that
free_pinned can be decremented appropriately.

Signed-off-by: Jonathan Lallinger <jonathan@ogc.us>
Signed-off-by: Steve Wise <swise@ogc.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-09-29 14:57:19 -04:00
Roy.Li
605b91c8f6 net: Documentation: Fix type of variables
Signed-off-by: Roy.Li <rongqing.li@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-09-29 14:57:19 -04:00
Michael Riesch
f9b491ecc4 usbnet: add timestamping support
In order to make USB-to-Ethernet-adapters (depending on usbnet) support
timestamping, the "skb_defer_rx_timestamp" and "skb_tx_timestamp" function
calls are added.

Signed-off-by: Michael Riesch <michael@riesch.at>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-09-29 14:46:49 -04:00
Xiao Jiang
7f5c6addcd net/fec: add poll controller function for fec nic
Add poll controller function for fec nic.

Signed-off-by: Xiao Jiang <jgq516@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-09-29 14:46:49 -04:00
Xiao Jiang
c7c83d1c95 net/fec: replace hardcoded irq num with macro
Don't use hardcoded irq num and replace it with
FEC_IRQ_NUM macro.

Signed-off-by: Xiao Jiang <jgq516@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-09-29 14:46:49 -04:00
Andre Guedes
e95beb4141 Bluetooth: hci_le_adv_report_evt code refactoring
There is no reason to treat the first advertising entry differently
from the potential other ones. Besides, the current implementation
can easily leads to typos.

Signed-off-by: Andre Guedes <andre.guedes@openbossa.org>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2011-09-29 15:25:08 -03:00
Waldemar Rymarkiewicz
b6f98044a6 Bluetooth: Fix possible NULL pointer dereference
Checking conn->pending_sec_level if there is no connection leads to potential
null pointer dereference. Don't process pin_code_request_event at all if no
connection exists.

Signed-off-by: Waldemar Rymarkiewicz <waldemar.rymarkiewicz@gmail.com>
Acked-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
2011-09-29 15:23:58 -03:00
Ingo Molnar
4167ab90ee Merge branch 'core' of git://amd64.org/linux/rric into perf/core 2011-09-29 17:35:29 +02:00
David Vrabel
f3f436e33b xen: release all pages within 1-1 p2m mappings
In xen_memory_setup() all reserved regions and gaps are set to an
identity (1-1) p2m mapping.  If an available page has a PFN within one
of these 1-1 mappings it will become inaccessible (as it MFN is lost)
so release them before setting up the mapping.

This can make an additional 256 MiB or more of RAM available
(depending on the size of the reserved regions in the memory map) if
the initial pages overlap with reserved regions.

The 1:1 p2m mappings are also extended to cover partial pages.  This
fixes an issue with (for example) systems with a BIOS that puts the
DMI tables in a reserved region that begins on a non-page boundary.

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-09-29 11:12:15 -04:00
David Vrabel
dc91c728fd xen: allow extra memory to be in multiple regions
Allow the extra memory (used by the balloon driver) to be in multiple
regions (typically two regions, one for low memory and one for high
memory).  This allows the balloon driver to increase the number of
available low pages (if the initial number if pages is small).

As a side effect, the algorithm for building the e820 memory map is
simpler and more obviously correct as the map supplied by the
hypervisor is (almost) used as is (in particular, all reserved regions
and gaps are preserved).  Only RAM regions are altered and RAM regions
above max_pfn + extra_pages are marked as unused (the region is split
in two if necessary).

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-09-29 11:12:10 -04:00
David Vrabel
8b5d44a5ac xen: allow balloon driver to use more than one memory region
Allow the xen balloon driver to populate its list of extra pages from
more than one region of memory.  This will allow platforms to provide
(for example) a region of low memory and a region of high memory.

The maximum possible number of extra regions is 128 (== E820MAX) which
is quite large so xen_extra_mem is placed in __initdata.  This is safe
as both xen_memory_setup() and balloon_init() are in __init.

The balloon regions themselves are not altered (i.e., there is still
only the one region).

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-09-29 11:12:10 -04:00
David Vrabel
b1cbf9b1d6 xen/balloon: simplify test for the end of usable RAM
When initializing the balloon only max_pfn needs to be checked
(max_pfn will always be <= e820_end_of_ram_pfn()) and improve the
confusing comment.

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-09-29 11:12:09 -04:00
David Vrabel
aa24411b67 xen/balloon: account for pages released during memory setup
In xen_memory_setup() pages that occur in gaps in the memory map are
released back to Xen.  This reduces the domain's current page count in
the hypervisor.  The Xen balloon driver does not correctly decrease
its initial current_pages count to reflect this.  If 'delta' pages are
released and the target is adjusted the resulting reservation is
always 'delta' less than the requested target.

This affects dom0 if the initial allocation of pages overlaps the PCI
memory region but won't affect most domU guests that have been setup
with pseudo-physical memory maps that don't have gaps.

Fix this by accouting for the released pages when starting the balloon
driver.

If the domain's targets are managed by xapi, the domain may eventually
run out of memory and die because xapi currently gets its target
calculations wrong and whenever it is restarted it always reduces the
target by 'delta'.

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-09-29 11:12:09 -04:00
Stefano Stabellini
5fbdc10395 xen: remove XEN_PLATFORM_PCI config option
Xen PVHVM needs xen-platform-pci, on the other hand xen-platform-pci is
useless in any other cases.
Therefore remove the XEN_PLATFORM_PCI config option and compile
xen-platform-pci built-in if XEN_PVHVM is selected.

Changes to v1:

- remove xen-platform-pci.o and just use platform-pci.o since it is not
externally visible anymore.

Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-09-29 10:52:16 -04:00
Stefano Stabellini
b17d0b5c08 xen: XEN_PVHVM depends on PCI
Xen PV on HVM guests require PCI support because they need the
xen-platform-pci driver in order to initialize xenbus.

Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-09-29 10:52:16 -04:00
Dan Carpenter
e1db4cef89 xen/pciback: double lock typo
We called mutex_lock() twice instead of unlocking.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-09-29 10:50:26 -04:00
Stefano Stabellini
0930bba674 xen: modify kernel mappings corresponding to granted pages
If we want to use granted pages for AIO, changing the mappings of a user
vma and the corresponding p2m is not enough, we also need to update the
kernel mappings accordingly.
Currently this is only needed for pages that are created for user usages
through /dev/xen/gntdev. As in, pages that have been in use by the
kernel and use the P2M will not need this special mapping.
However there are no guarantees that in the future the kernel won't
start accessing pages through the 1:1 even for internal usage.

In order to avoid the complexity of dealing with highmem, we allocated
the pages lowmem.
We issue a HYPERVISOR_grant_table_op right away in
m2p_add_override and we remove the mappings using another
HYPERVISOR_grant_table_op in m2p_remove_override.
Considering that m2p_add_override and m2p_remove_override are called
once per page we use multicalls and hypercall batching.

Use the kmap_op pointer directly as argument to do the mapping as it is
guaranteed to be present up until the unmapping is done.
Before issuing any unmapping multicalls, we need to make sure that the
mapping has already being done, because we need the kmap->handle to be
set correctly.

Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
[v1: Removed GRANT_FRAME_BIT usage]
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-09-29 10:32:58 -04:00
Stefano Stabellini
693394b8c3 xen: add an "highmem" parameter to alloc_xenballooned_pages
Add an highmem parameter to alloc_xenballooned_pages, to allow callers to
request lowmem or highmem pages.

Fix the code style of free_xenballooned_pages' prototype.

Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-09-29 09:56:52 -04:00
Jan Schmidt
5da6fcbc4e btrfs: integrating raid-repair and scrub-fixup-nodatasum
This ties nodatasum fixup in scrub together with raid repair patches. While
both series are working fine alone, scrub will report uncorrectable errors
if they occur in a nodatasum extent *and* the page is in the page cache.

Previously, we would have triggered readpage to find good data and do the
repair. However, readpage wouldn't read anything in the case where the page
is up to date in the cache. So, we simply take that good data we have and
call repair_io_failure directly (unless the page in the cache is dirty).

Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
2011-09-29 13:38:43 +02:00
Jan Schmidt
4a54c8c165 btrfs: Moved repair code from inode.c to extent_io.c
The raid-retry code in inode.c can be generalized so that it works for
metadata as well. Thus, this patch moves it to extent_io.c and makes the
raid-retry code a raid-repair code.

Repair works that way: Whenever a read error occurs and we have more
mirrors to try, note the failed mirror, and retry another. If we find a
good one, check if we did note a failure earlier and if so, do not allow
the read to complete until after the bad sector was written with the good
data we just fetched. As we have the extent locked while reading, no one
can change the data in between.

Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
2011-09-29 13:38:42 +02:00
Jan Schmidt
2774b2ca3d btrfs: Put mirror_num in bi_bdev
The error correction code wants to make sure that only the bad mirror is
rewritten. Thus, we need to know which mirror is the bad one. I did not
find a more apropriate field than bi_bdev. But I think using this is fine,
because it is modified by the block layer, anyway, and should not be read
after the bio returned.

Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
2011-09-29 13:38:42 +02:00
Jan Schmidt
1503140d3e btrfs: Do not use bio->bi_bdev after submission
The block layer modifies bio->bi_bdev and bio->bi_sector while working on
the bio, they do _not_ come back unmodified in the completion callback.

To call add_page, we need at least some bi_bdev set, which is why the code
was working, previously. With this patch, we use the latest_bdev from
fsinfo instead of the leftover in the bio. This gives us the possibility to
use the bi_bdev field for another purpose.

Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
2011-09-29 13:38:42 +02:00
Jan Schmidt
a1d3c4786a btrfs: btrfs_multi_bio replaced with btrfs_bio
btrfs_bio is a bio abstraction able to split and not complete after the last
bio has returned (like the old btrfs_multi_bio). Additionally, btrfs_bio
tracks the mirror_num used to read data which can be used for error
correction purposes.

Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
2011-09-29 13:38:42 +02:00
Jan Schmidt
d7728c960d btrfs: new ioctls to do logical->inode and inode->path resolving
these ioctls make use of the new functions initially added for scrub. they
return all inodes belonging to a logical address (BTRFS_IOC_LOGICAL_INO) and
all paths belonging to an inode (BTRFS_IOC_INO_PATHS).

Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
2011-09-29 12:54:28 +02:00
Jan Schmidt
0ef8e45158 btrfs scrub: add fixup code for errors on nodatasum files
This removes a FIXME comment and introduces the first part of nodatasum
fixup: It gets the corresponding inode for a logical address and triggers a
regular readpage for the corrupted sector.

Once we have on-the-fly error correction our error will be automatically
corrected. The correction code is expected to clear the newly introduced
EXTENT_DAMAGED flag, making scrub report that error as "corrected" instead
of "uncorrectable" eventually.

Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
2011-09-29 12:54:28 +02:00